Attached is information about Release 2 of the Multics
Sort/Merge, which is scheduled for Multics Release 4.0 in June
1976. There are four write-ups (sort command, merge command,
sort_ subroutine, and merge_ subroutine) in the usual form
for the Multics Programmers' Manual, and one write-up of
additional interfaces to be documented in the PLM.

at Honeywell Billerica by mail or phone, or via "mail Serson
MSORT" on either the MIT or Phoenix Multics systems.

1.1 A Merge, or file collation, function has been added.

1.2 A subroutine interface for both the Sort and Merge has been 
added. 

1.3 Support for the SORT portion of the ANSI COBOL Sort/Merge
    Module, Level 2, has been added. (The COBOL MERGE function
    is not supported by this package.)

1.4 Additional data types for keys and multiple key fields are
    supported. Release 1 supported only character string and a
    single key field.

1.5 Additional storage media and file organizations are
    supported for the input and output files. Essentially any
    file can be supported which can be read or written
    sequentially via iox_ using any available I/O module.
    Release 1 supported only sequential input and output files
    in the Multics storage system (using vfile_).

1.6 The following additional user exit points are provided:

    input_record exit:   Permits the user to alter, delete, or
                         insert records before they enter the
                         sorting or merging process.

    output_record exit:  Permits the user to alter, delete,
                         insert, or summarize records coming
                         out of the sorting or merging process
                         before they are written to the output
                         file.

1.7 Sequence checking for output records has been added.

1.8 A file size argument has been added.

1.9 Command arguments for measurement and testing have been
    added (-time, -merge_order, and -string_size).

2. CHANGED SPECIFICATIONS

    The keyword -sort_desc (-sd) must precede the pathname of
    the Sort Description (when the Sort Description is supplied
    in a segment). In Release 1, the pathname of the Sort
    Description must be the first argument and is not preceded
    by a keyword.

3. QUESTIONS ABOUT DOCUMENTATION

I would like to raise the following questions about 
documentation of the Sort/Merge. 

3.1 Should the Sort and the Merge be documented in four separate
    MPM write-ups, as attached; or should the Merge (command and
    subroutine) be documented in two shorter write-ups which
    then refer to the two Sort write-ups for details? There is
    much in common between the Sort and the Merge. On the other
    hand, noting differences applicable to the Merge in the Sort
    write-ups may be somewhat complicated and confusing.

3.2 Should there be a separate Users' Guide for the Sort/Merge?
    If so, what information should go in the MPM and what in the
    Users' Guide? Some information not presently in the MPM
    write-ups which might go into a Users' Guide is:

        text of error messages

        description of the report produced by the Sort/Merge
        (various counts of records processed, data produced by
        the -time argument)

        I/O usage; e.g. for PL/I I/O, Fortran, record_stream_,
        syn_, etc.

        relationship between file size, work space required,
        optimization, etc.

3.3 Should the additional command arguments described in the PLM
    write-up be documented directly in the MPM Commands
    write-ups?

Name: sort

The sort command provides a generalized file sorting
capability, which is specialized for execution by user
supplied parameters. The basic function of the Sort is to
read one or more input files of records which are not
ordered, sort those records according to the values of one
or more key fields, and write a single file of ordered (or
"ranked") records. The Sort has the following general
capabilities:

Input and output files may be on any storage medium and in
any file organization;

Very large files, such as multisegment files, can be sorted;

Multiple key fields and most PL/I string and numeric data
types may be specified;

Exits to user supplied subroutines are permitted at several
points during the sorting process.

In addition to arguments to the sort command, other 
information is necessary to specialize the Sort for a particular 
execution. This information, called the Sort Description, can be 
supplied either through the user's terminal or in a segment. 

The description given here of the sort command is sufficient 
for situations where the Sort is free standing; that is, where
no user supplied procedures are executed. (User supplied 
procedures are called "exit procedures".) Additional information 
is necessary for executing the sort command with exit procedures, 
and is contained in the description of the sort_ subroutine in 
the Multics Programmers' Manual, Subroutines, Section II. 

INPUT AND OUTPUT 

The user can specify the input and output files. In this
environment, the Sort reads the input files and writes the output
file. Each input or output file may be stored on any medium and
in any file organization supported by an I/O module through iox_.
The I/O module may be one of the Multics system I/O modules (such
as tape_ansi_), or one supplied by a specific installation, or
one written by a user. An input or output file is specified
either by a pathname or by an attach description.

Alternatively, the user can supply either an input_file
procedure or an output_file procedure (or both). An input_file
procedure is responsible for reading input and releasing records
to the Sort. An output_file procedure is responsible for
retrieving records (ranked by the Sort) from the Sort and writing
output.

In all cases, records may be either fixed length or variable
length.



KEY FIELDS

The user can specify the key fields to be used in ranking
records. Key fields are described in the Keys statement of the
Sort Description. Up to 20 key fields may be specified. Any
PL/I string or numeric data type - except complex or pictured -
may be specified for a given key field. Ranking may be
ascending, descending, or mixed. For a character string field,
the collating sequence is that of the Multics standard character
set.

Alternatively, the user can specify a user supplied compare
procedure, which is then used to rank records.

The original order of records with equal keys is preserved
(FIFO order). Original input order is defined as follows:

1. If two equal records come from different input files, then
   the record from the file which is specified earlier in the
   command line is first.

2. If two equal records come from the same input file, then the
   record which is earlier in the file is first.
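
The FIFO rule for equal keys can be illustrated with a short Python sketch. This is not the Sort's implementation; the record layout (a key paired with a label) and the file contents are invented for illustration. It relies only on the fact that Python's built-in sorted() is stable.

```python
# Each inner list stands for one input file, in command-line order.
# Records are (key, label) pairs; both are hypothetical.
input_files = [
    [("b", "file1-rec1"), ("a", "file1-rec2"), ("b", "file1-rec3")],
    [("a", "file2-rec1"), ("b", "file2-rec2")],
]

# Concatenate the files in command-line order, then rank on the key alone.
# Because sorted() is stable, records with equal keys keep their
# concatenation order: earlier file first, then earlier position in the file.
all_records = [rec for one_file in input_files for rec in one_file]
ranked = sorted(all_records, key=lambda rec: rec[0])

for key, label in ranked:
    print(key, label)
```

Records with key "a" come out as file1-rec2 then file2-rec1, and records with key "b" as file1-rec1, file1-rec3, file2-rec2, which matches rules 1 and 2 above.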



EXITS 

The Sort provides exits to user supplied procedures at 
specific points during the sorting process. Exit procedures are 
named in the Exits statement of the Sort Description. The 
following exit points are provided:

input_file      To obtain input records and release them one
                by one to the sorting process.

output_file     To retrieve ranked records one by one from
                the sorting process and output them.

input_record    To perform special processing for each input
                record, such as deleting, inserting, or
                altering records to be input to the Sort.

output_record   To perform special processing for each output
                record, such as deleting, inserting, or
                altering records to be output from the Sort;
                or summarizing data by accumulating it into a
                summary record.

compare         To compare two records; that is, to rank them
                for the sorting process.

sort input_specs output_spec control_args

where:

1. input_specs  indicates that the user is specifying the
                input files. Up to 10 input files may be
                specified. Each input file specification
                (each input_spec) may be supplied in one of
                the following forms:

                -input_file pathname
                -if pathname
                     If an input file is in the Multics
                     storage system and its file organization
                     is either sequential or indexed, then it
                     may be specified by its pathname. The
                     file may be either a single segment or a
                     multisegment file. The star convention
                     cannot be used.

                     An input file specified by a pathname
                     will be attached using the attach
                     description "vfile_ pathname".

                -input_description "attach_desc"
                -ids "attach_desc"
                     If an input file is not in the Multics
                     storage system or its file organization
                     is neither sequential nor indexed, then
                     it must be specified by an attach
                     description. The attach description
                     must be quoted. The target I/O module
                     specified via the attach description
                     must support the sequential_input
                     opening mode and the iox_ entry point
                     read_record.

                Pathnames and attach descriptions can be
                intermixed in the input_specs argument.

                If the user is supplying an input_file exit
                procedure, then the input_specs argument must
                be omitted and the input_file exit procedure
                must be named in the Exits statement of the
                Sort Description.

2. output_spec  indicates that the user is specifying the
                output file. Only one output file can be
                specified. The output file specification
                (output_spec) may be supplied in one of the
                following forms:

                -output_file pathname
                -of pathname
                     If the output file is in the Multics
                     storage system and its file organization
                     is sequential, then it may be specified
                     by its pathname. The file may be either
                     a single segment or a multisegment file.

                     The equals convention may be used. If
                     it is, it is applied to the pathname of
                     the first input file and the first input
                     file must be specified by a pathname,
                     not by an attach description.

                     An output file specified by a pathname
                     will be attached using the attach
                     description "vfile_ pathname". Thus if
                     the file does not exist, it will be
                     created. If it does exist, it will be
                     overwritten.

                -output_file -replace
                -of -rp
                     The output file is to replace the first
                     input file. That input file will be
                     overwritten during the merge phase of
                     the Sort. If -replace is used, the
                     first input file must be specified by a
                     pathname, not by an attach description.

                -output_description "attach_desc"
                -ods "attach_desc"
                     If the output file is not in the Multics
                     storage system or its file organization
                     is not sequential, then it must be
                     specified by an attach description. The
                     attach description must be quoted. The
                     target I/O module specified via the
                     attach description must support the
                     sequential_output opening mode and the
                     iox_ entry point write_record.

                If the user is supplying an output_file exit
                procedure, then the output_spec argument must
                be omitted and the output_file exit procedure
                must be named in the Exits statement of the
                Sort Description.

3. control_args must be chosen from the following:

                -console_input
                -ci
                     indicates that the Sort Description is
                     read via the I/O switch user_input
                     (which normally is the user's terminal).

                -sort_desc sd_path
                -sd sd_path
                     indicates that the user is specifying
                     the pathname of the segment containing
                     the Sort Description.

                Either the -console_input or the -sort_desc
                argument - but not both - must be specified.
                See the heading Sort Description below.

                -temp_dir td_path
                -td td_path
                     indicates that the user is specifying
                     the pathname of the directory which will
                     contain the Sort's work files. The
                     equals convention cannot be used.

                     If this argument is omitted, work files
                     will be contained in the user's process
                     directory.

                     This argument should be used when the
                     process directory will not be large
                     enough to contain the work files. The
                     [wd] active function may be used for
                     td_path to place work files in the
                     user's current working directory.

                -file_size f
                     specifies that the total amount of data
                     to be sorted is f millions of bytes.
                     The argument f must be a decimal number.
                     If the -file_size argument is omitted,
                     the default assumption is approximately
                     one million bytes (f = 1.0).

                     This argument is intended for use when
                     some or all of the input files are not
                     in the storage system (that is, are not
                     specified by pathnames) or when an
                     input_file exit procedure is used. In
                     these cases the Sort cannot determine
                     the amount of input data. (The Sort
                     does compute the total amount of input
                     data which is in the storage system,
                     using segment bit counts.) The
                     -file_size argument may also be used
                     when all of the input files are in the
                     storage system but records are to be
                     inserted or deleted through an
                     input_record exit procedure.

                     The -file_size argument is used for
                     optimization of performance; the actual
                     amount of input data can be considerably
                     larger without preventing the Sort from
                     completing. The maximum amount of data
                     which can be sorted is (in bytes)
                     approximately 60 million times the
                     square root of f.
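
The capacity rule stated for -file_size (approximately 60 million bytes times the square root of the -file_size value) can be worked through numerically. The following Python sketch only restates that arithmetic; the function name is hypothetical and the result is an approximation, as the text says.

```python
import math


def approx_max_sortable_bytes(file_size_millions: float = 1.0) -> float:
    """Approximate upper limit, in bytes, for a given -file_size value.

    Implements only the rule quoted in the text: about 60 million
    times the square root of the -file_size value. The default of 1.0
    mirrors the documented default assumption of one million bytes.
    """
    if file_size_millions <= 0:
        raise ValueError("-file_size value must be positive")
    return 60_000_000 * math.sqrt(file_size_millions)


for value in (1.0, 4.0, 25.0):
    print(value, approx_max_sortable_bytes(value))
```

With the default value of 1.0 the approximate limit is 60 million bytes; quadrupling the -file_size value doubles the limit.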



NOTES 

Arguments can appear in any order, but a pathname or attach
description must immediately follow its keyword.

The temporary directory pathname (td_path) is the name of a
directory. The Sort Description pathname (sd_path) is the name
of a segment.

Any pathname may be relative (to the user's current working
directory) or absolute.




Sort Description

The Sort Description contains additional information to
specialize the Sort for a particular execution. The information
supplied may be:

Keys  -  Description of one or more key fields used for
         ranking records.

Exits -  Specification of which exit points are to be used
         and the names of the corresponding user supplied
         exit procedures.

A Sort Description is required. As a minimum, the user must
specify how records are to be ranked, either by describing key
fields in the Keys statement or by naming a compare exit
procedure in the Exits statement. Other information in the Sort
Description is optional.

The Sort Description may be supplied as a segment or read
via the I/O switch user_input (normally the user's terminal).

If the Sort Description is supplied in a segment, its
pathname is specified in the -sort_desc argument.

If the Sort Description is read via the user's terminal,
the -console_input argument is used. The Sort prints "Input:"
via the I/O switch user_output and waits for input. The user
then types the Sort Description. To terminate the Sort
Description, the user types a line consisting of a period (".")
followed by a line feed. (This line is not part of the Sort
Description.)



SYNTAX OF THE SORT DESCRIPTION 

A Sort Description consists of a set of statements. Each
statement must begin with a function keyword. The function
keyword is followed by the function keyword delimiter colon
(":"). The statement itself consists of one or more parameters,
separated by parameter delimiters. The parameter delimiters are
spaces, commas (","), or (in certain specific cases as specified
below) parentheses ("(" and ")"). Each statement must end with
the statement delimiter semicolon (";").
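
The statement syntax above (function keyword, colon, parameters, semicolon) can be sketched as a small Python tokenizer. This is an illustrative simplification, not the Sort's own parser: the function name is hypothetical, and parentheses are treated as plain delimiters, so the grouping they express is not retained.

```python
import re


def parse_description(text: str) -> dict[str, list[str]]:
    """Split a description into {function_keyword: [parameters, ...]}."""
    statements: dict[str, list[str]] = {}
    # Statements end with the statement delimiter ";".
    for raw in text.split(";"):
        raw = raw.strip()
        if not raw:
            continue
        # The function keyword is followed by the delimiter ":".
        keyword, colon, body = raw.partition(":")
        if not colon:
            raise ValueError(f"missing ':' in statement: {raw!r}")
        # Parameters are separated by spaces, commas, or parentheses.
        statements[keyword.strip()] = [p for p in re.split(r"[\s,()]+", body) if p]
    return statements


description = """
keys:  fixed bin(35) 0, char(8) 1;
exits: input_file user$input,
       output_file user$output;
"""
print(parse_description(description))
```

A fuller parser would also validate the keyword (keys or exits) and group each key description's datatype, size, and position; that is left out to keep the sketch short.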

In the descriptions below, certain notational conventions
are used. A word enclosed between the less than and greater than
symbols ("<" and ">") is a notational variable, which must be
replaced by an actual word or phrase of the Sort Description
language. A word not enclosed between < and > is an actual word
of the Sort Description language. A phrase enclosed between
brackets ("[" and "]") is optional. A phrase enclosed between
braces ("{" and "}") and followed by an ellipsis ("...") is
required, and may be repeated one or more times.



KEYS STATEMENT 

The Keys statement specifies key fields used to rank the
records of the input files. The format of the Keys statement is:

     keys: {<key_description>} ... ;

The Keys statement consists of a series of one or more
<key_description>s. The key descriptions are specified in order,
the first describing the major key and the last describing the
most minor key. Up to 20 key descriptions may be supplied.

A key description is the specification of a single key
field. The format of a <key_description> is:

     <datatype> (<size>) <position> [descending]

where:

1. <datatype>   is the data type of the key field. This
                element is required. See the table below for
                the encoding of <datatype>.

2. <size>       is the size of the key field, expressed in a
                form which depends on the data type. This
                element is required.

                For string data types, <size> is the length
                (characters or bits) of the field. The
                length is the exact amount of space occupied
                by the field.

                For arithmetic data types, <size> is the
                precision (binary or decimal digits) of the
                field. Scale factor, if any, must not be
                written (it is not required by the Sort).
                The space occupied is determined by the
                precision in combination with the data type
                and the alignment. (Alignment is specified
                via <position>.) For an aligned binary field
                (fixed or floating), the space occupied is
                increased if necessary to an integral number
                of words.

                <size> must be a decimal integer. The unit
                depends on the data type. See the table
                below for the semantics of <size>. (The
                rules used are the same as those used by
                Multics PL/I.)

3. <position>   is the offset of the beginning of the key
                field, relative to the beginning of the
                record. Consider the record as being aligned
                on a word boundary, as will be the case for a
                Multics PL/I structure. This element is
                required. There are two formats:

                <w>
                     where <w> is the word offset. Words are
                     numbered from 0 for the first word of
                     the record. This format specifies to
                     the Sort that the key field is aligned
                     on a word or (if <w> is even) on a
                     double word boundary.

                <w> (<b>)
                     where <w> is the word portion of the
                     offset and <b> is the bit portion of the
                     offset; that is, the bit offset within
                     the word. Bits are numbered from 0 to
                     35. This format implies that the key
                     field is not aligned on a word boundary.
                     If the key field is aligned on a word
                     boundary but the user specifies a bit
                     offset of 0 anyway, the Sort will
                     operate correctly although speed of
                     execution may be affected.

                The formats for <position> and the values for
                <w> and <b> are consistent with those shown
                in Multics PL/I listings or used by debug.

4. descending
   dsc          specifies descending order for ranking using
                this key field. This element may be omitted;
                the default is ascending order for this key
                field.
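
The <w> (<b>) position format described above can be converted to an absolute bit offset from the start of the record, using the 36-bit word implied by the bit numbering 0 to 35. The following Python sketch is illustrative only; the function names are hypothetical.

```python
BITS_PER_WORD = 36   # bits within a word are numbered 0 to 35
BITS_PER_CHAR = 9    # 9 bit characters, four per word


def bit_offset(word: int, bit: int = 0) -> int:
    """Absolute bit offset of a key field given <w> and optional <b>."""
    if word < 0:
        raise ValueError("word offset must be 0 or greater")
    if not 0 <= bit < BITS_PER_WORD:
        raise ValueError("bit offset must be in the range 0 to 35")
    return word * BITS_PER_WORD + bit


def char_offset(word: int, bit: int = 0) -> int:
    """Character index of a key field that starts on a character boundary."""
    offset = bit_offset(word, bit)
    if offset % BITS_PER_CHAR:
        raise ValueError("position is not on a 9 bit character boundary")
    return offset // BITS_PER_CHAR


print(bit_offset(0, 18))   # position 0(18)
print(bit_offset(2))       # position 2
print(char_offset(0, 18))  # character index for position 0(18)
```

Note that the sketch does not distinguish "2" from "2(0)"; in the Sort Description the two differ in that only the first declares the field aligned.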

DATATYPE ENCODING AND SEMANTICS OF SIZE

                   Encoding    | Semantics of <size> (where <size> = n)
                   of          |
Data type          <datatype>  | Unit       Range     Space Occupied
----------------------------------------------------------------------
Character string   char        | 9 bit      1 - 4095  n characters
(Multics ASCII)                | character

Bit string         bit         | 1 bit      1 - 4095  n bits

Fixed binary       bin         | 1 bit      1 - 71    Aligned:
                               |                       1 <= n <= 35: one word
                               |                      36 <= n <= 71: two words
                               |                      Unaligned: n + 1 bits

Floating binary    float bin   | 1 bit      1 - 63    Aligned:
                               |                       1 <= n <= 27: one word
                               |                      28 <= n <= 63: two words
                               |                      Unaligned: n + 9 bits

Fixed decimal      dec         | 9 bit      1 - 59    n + 1 digits
(leading sign)                 | digit

Floating decimal   float dec   | 9 bit      1 - 59    n + 2 digits
                               | digit
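
The space rules in the table above can be restated as a small Python function. This is a sketch for checking key layouts by hand, not part of the Sort; the function name and the dictionary layout are hypothetical, and the one-word limits (precision 35 for fixed binary, 27 for floating binary) are taken from the table.

```python
BITS_PER_WORD = 36
BITS_PER_CHAR = 9

# Maximum <size> permitted for each <datatype> encoding.
MAX_SIZE = {"char": 4095, "bit": 4095, "bin": 71,
            "float bin": 63, "dec": 59, "float dec": 59}


def space_in_bits(datatype: str, n: int, aligned: bool = False) -> int:
    """Space occupied by a key field, in bits.

    `aligned` matters only for the binary types; it corresponds to a
    <position> written without a bit offset.
    """
    if datatype not in MAX_SIZE:
        raise ValueError(f"unknown datatype: {datatype!r}")
    if not 1 <= n <= MAX_SIZE[datatype]:
        raise ValueError(f"<size> out of range for {datatype}: {n}")
    if datatype == "char":
        return n * BITS_PER_CHAR
    if datatype == "bit":
        return n
    if datatype == "bin":
        if aligned:
            return BITS_PER_WORD if n <= 35 else 2 * BITS_PER_WORD
        return n + 1                      # precision plus sign bit
    if datatype == "float bin":
        if aligned:
            return BITS_PER_WORD if n <= 27 else 2 * BITS_PER_WORD
        return n + 9
    if datatype == "dec":
        return (n + 1) * BITS_PER_CHAR    # digits plus leading sign
    return (n + 2) * BITS_PER_CHAR        # float dec: n + 2 digits


print(space_in_bits("bin", 17, aligned=True))
print(space_in_bits("bin", 17))
print(space_in_bits("dec", 6))
```

For example, an aligned bin(17) field occupies one full word, while the same field with a bit offset occupies 18 bits.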



In addition to the forms shown for <datatype> in the table 
above, the following variants are also permitted: 

The following alternate spellings may be used: 

     char | character     bin | binary     dec | decimal

The word "fixed" may be used (or omitted). For example:

     fixed bin | bin      fixed dec | dec

The words may be written in any sequence. For example:

     float bin | bin float

EXAMPLES OF KEY DESCRIPTIONS

char(10), 0(18)       Character string, Multics ASCII code, length
                      ten characters; starts at bit 18 of word 0.

char(8), 1, descending
                      Character string, Multics ASCII code, length
                      eight characters; starts at bit 0 of word 1;
                      ranking is descending.

character(4), 2, dsc  Character string, Multics ASCII code, length
                      four characters; starts at bit 0 of word 2;
                      ranking is descending.

bit(16), 0(2)         Bit string, length 16 bits; starts at bit 2
                      of word 0.

bin(17), 2            Fixed binary, precision 17; since no bit
                      offset is specified, is aligned and thus
                      occupies one word (equivalent to "bin(35),
                      2").

bin(17), 2(18)        Fixed binary, precision 17; since a bit
                      offset is specified, is unaligned and
                      occupies 18 bits; starts at bit 18 of word 2
                      (i.e., is in the low order half of word 2).

bin(1), 2(0)          Fixed binary, precision 1; unaligned and thus
                      occupies 2 bits; starts at bit 0 of word 2.

bin(1), 2             Fixed binary, precision 1; aligned and thus
                      occupies one word (equivalent to "bin(35),
                      2").

bin(36), 2            Fixed binary, precision 36; since no bit
                      offset is specified and precision is greater
                      than 35 and word offset is even, is aligned
                      and occupies two words (equivalent to
                      "bin(71), 2").

dec(6), 0(9)          Fixed decimal, 9 bit digit, precision 6;
                      starts at bit 9 of word 0 and occupies 7
                      digits including sign (that is, through the
                      end of word 1).

float dec(9), 0(9)    Floating decimal, 9 bit digit, precision 9;
                      starts at bit 9 of word 0 and occupies 11
                      digits including exponent and sign (that is,
                      through the end of word 2).

EXITS STATEMENT

An Exits statement specifies the exit procedures to be used
during execution of the Sort. The format of an Exits statement
is:

     exits: {<exit_description>} ... ;

The Exits statement consists of a set of one or more
<exit_description>s. Exit descriptions may be specified in any
order.

An exit description is the specification of one exit point
and the user supplied exit procedure to be called at that exit
point. The format of an <exit_description> is:

     <exit_name> <user_name>

where:

1. <exit_name>  is the keyword naming the exit point at which
                the user supplied exit procedure is to be
                called. Exit names may be chosen from the
                following list:

                     input_file
                     output_file
                     input_record
                     output_record
                     compare

2. <user_name>  is the name of the entry point of the user
                supplied procedure. This parameter has the
                same syntax and semantics as a command name.
                That is:

                <user_name> can be either a segment name (e.g.,
                segment) or a segment name and an entry point
                name (e.g., segment$entry_point). In these
                cases, the user's current search rules are
                applied to find the procedure. (If some
                segment is already known by the specified
                reference name, that segment is used.)

                <user_name> can also be a pathname; that is,
                can specify a directory hierarchy location,
                either relative (to the user's current
                working directory) or absolute. In this
                case, the search rules are not applied and
                the pathname is used to find the procedure.
                (If some other segment is already known by
                the specified reference name, that segment is
                terminated first.)



WRITING EXIT PROCEDURES 

The exit points to be used during an execution of the Sort 
and the names of the corresponding user supplied exit procedures 
are specified in the Exits statement as described above. The 
specifications for writing exit procedures (PL/I declare and call 
statements) and the functional requirements imposed upon exit 
procedures are given in the description of the sort_ subroutine
in Section II of MPM Subroutines.

sort -input_file sort.in -output_file =.out -console_input
Input:

keys: char(10), 0;

In this example, the arguments of the command state that
there is one input file, whose pathname is sort.in; the output
file pathname is sort.out; the Sort Description is input via the
user's terminal; and by default the work files are contained in
the user's process directory.

The Sort Description states that there is one key, a
character string of length 10 characters, starting at word 0 bit
0 of the record. There are no exits specified.



sort -temp_dir >udd>pool -sort_desc sd

In this example the arguments of the command state that the
work files are contained in the directory >udd>pool; and the
Sort Description is contained in the segment named sd.

Assume that the segment sd contains:

     keys:  fixed bin(35) 0, char(8) 1;
     exits: input_file user$input,
            output_file user$output;

The Sort Description states that there are two keys. The
major key is an aligned fixed binary field of precision 35,
contained in word 0 of the record. The minor key is a character
string of length 8, contained in words 1 and 2 of the record.

There are two exits, an input_file procedure exit and an
output_file procedure exit. The input_file exit procedure entry
point is named user$input; the output_file exit procedure entry
point is named user$output. These exits must be specified
because the command did not specify either an input file or an
output file.



sort -if sort_in -of -replace -td [wd] -sd sort_desc

In this example the arguments of the command state that the
input file is named sort_in; the output file is to replace the
input file; work files are contained in the user's current
working directory; and the Sort Description is contained in the
segment sort_desc.

sort -input_description "tape_ansi_ vol_1 -name a" -if b \
     -output_description "vfile_ c -extend" -ci

In this example there are two input files. The first input
file is specified by an attach description for the I/O module
tape_ansi_ with the attach argument "vol_1 -name a". The second
input file is specified by the pathname b, and thus must be a
sequential or indexed file in the storage system. The output
file is specified by an attach description for the I/O module
vfile_ with the attach argument "c -extend". For the I/O module
vfile_, this means that the pathname is c and the file is to be
extended; that is, output records from the Sort will be written
at the end of the file c (if it already exists).

(A \ followed by a line feed is used to continue the command
arguments onto the second line.)

The Sort Description (not shown) will be read via the user's
terminal.



sort -ids "record_stream_ -target vfile_ a" -of b -ci

In this example assume that the input file is an
unstructured file in the storage system, with the pathname a.
The input file has been specified by an attach description using
the I/O module record_stream_, which will transform the record
I/O operations requested by the Sort into the appropriate stream
I/O operations for the target file a.



sort -ids "syn_ user_switchname" -of b -ci

In this example the input file is attached using the I/O
module syn_ to the I/O switch user_switchname, which must be
attached and closed.

Name: merge

The merge command provides a generalized file merging
capability, which is specialized for execution by user supplied
parameters. The basic function of the Merge is to read one or
more input files of records which are in order according to the
values of one or more key fields, merge (collate) those records
according to the values of those key fields, and write a single
file of ordered (or "ranked") records. The Merge has the
following general capabilities:

Input and output files may be on any storage medium and in
any file organization;

Very large files, such as multisegment files, can be merged;

Multiple key fields and most PL/I string and numeric data
types may be specified;

Exits to user supplied subroutines are permitted at several
points during the merging process.

In addition to arguments to the merge command, other
information is necessary to specialize the Merge for a particular
execution. This information, called the Merge Description, can
be supplied either through the user's terminal or in a segment.

The description given here of the merge command is
sufficient for situations where the Merge is free standing; that
is, where no user supplied procedures are executed. (User
supplied procedures are called "exit procedures".) Additional
information is necessary for executing the merge command with
exit procedures, and is contained in the description of the
merge_ subroutine in the Multics Programmers' Manual,
Subroutines, Section II.



INPUT AND OUTPUT 

The user specifies the input and output files. The Merge
reads the input files and writes the output file. Each input or
output file may be stored on any medium and in any file
organization supported by an I/O module through iox_. The I/O
module may be one of the Multics system I/O modules (such as
tape_ansi_), or one supplied by a specific installation, or one
written by a user. An input or output file is specified either
by a pathname or by an attach description.

In all cases, records may be either fixed length or variable
length.



KEY FIELDS 

The user can specify the key fields to be used in ranking
records. Key fields are described in the Keys statement of the
Merge Description. Up to 20 key fields may be specified. Any
PL/I string or numeric data type - except complex or pictured -
may be specified for a given key field. Ranking may be
ascending, descending, or mixed. For a character string field,
the collating sequence is that of the Multics standard character
set. The records of each input file must be in order according
to those key fields.

Alternatively, the user can specify a user supplied compare
procedure, which is then used to rank records. The records of
each input file must be in order according to the algorithm of
that procedure.

The original order of records with equal keys is preserved
(FIFO order). Original input order is defined as follows:

1. If two equal records come from different input files, then
   the record from the file which is specified earlier in the
   command line is first.

2. If two equal records come from the same input file, then the
   record which is earlier in the file is first.
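
The collation of already-ordered input files, with the tie rules above, can be sketched in Python as a k-way merge. This is an illustration of the general technique, not the Merge's actual algorithm; the function name, the record layout, and the sample data are hypothetical. Ties are broken by the input file's position on the command line, and order within a file is preserved because each file is read sequentially.

```python
import heapq

_END = object()  # sentinel marking an exhausted input file


def merge_files(files, key):
    """Yield records from pre-ordered `files` in ranked order.

    Heap entries are (key value, file index, record). At most one entry
    per file is in the heap at a time, so equal keys are resolved by the
    file index and records themselves are never compared.
    """
    iterators = [iter(f) for f in files]
    heap = []
    for index, it in enumerate(iterators):
        record = next(it, _END)
        if record is not _END:
            heap.append((key(record), index, record))
    heapq.heapify(heap)
    while heap:
        _, index, record = heapq.heappop(heap)
        yield record
        following = next(iterators[index], _END)
        if following is not _END:
            heapq.heappush(heap, (key(following), index, following))


file1 = [("a", "f1-1"), ("b", "f1-2"), ("b", "f1-3")]
file2 = [("a", "f2-1"), ("b", "f2-2"), ("c", "f2-3")]
merged = list(merge_files([file1, file2], key=lambda rec: rec[0]))
print([label for _, label in merged])
```

The sketch assumes each input is already in order on the key; like the Merge, it does not re-sort records within a file.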



EXITS 

The Merge provides exits to user supplied procedures at 
specific points during the merging process. Exit procedures are 
named in the Exits statement of the Merge Description. The 
following exit points are provided:

output_record   To perform special processing for each output
                record, such as deleting, inserting, or
                altering records to be output from the Merge;
                or summarizing data by accumulating it into a
                summary record.

compare         To compare two records; that is, to rank them
                for the merging process.

merge input_specs output_spec control_args

where:

1. input_specs  indicates that the user is specifying the
                input files. Up to 10 input files may be
                specified. Each input file specification
                (each input_spec) may be supplied in one of
                the following forms:

                -input_file pathname
                -if pathname
                     If an input file is in the Multics
                     storage system and its file organization
                     is either sequential or indexed, then it
                     may be specified by its pathname. The
                     file may be either a single segment or a
                     multisegment file. The star convention
                     cannot be used.

                     An input file specified by a pathname
                     will be attached using the attach
                     description "vfile_ pathname".

                -input_description "attach_desc"
                -ids "attach_desc"
                     If an input file is not in the Multics
                     storage system or its file organization
                     is neither sequential nor indexed, then
                     it must be specified by an attach
                     description. The attach description
                     must be quoted. The target I/O module
                     specified via the attach description
                     must support the sequential_input
                     opening mode and the iox_ entry point
                     read_record.

                Pathnames and attach descriptions can be
                intermixed in the input_specs argument.

2. output_spec   indicates that the user is specifying the
                 output file. Only one output file can be
                 specified. The output file specification
                 (output_spec) may be supplied in one of the
                 following forms:

                 -output_file pathname
                 -of pathname
                      If the output file is in the Multics
                      storage system and its file organization
                      is sequential, then it may be specified
                      by its pathname. The file may be either
                      a single segment or a multisegment file.

                      The equals convention may be used. If
                      it is, it is applied to the pathname of
                      the first input file and the first input
                      file must be specified by a pathname,
                      not by an attach description.

                      An output file specified by a pathname
                      will be attached using the attach
                      description "vfile_ pathname". Thus if
                      the file does not exist, it will be
                      created. If it does exist, it will be
                      overwritten.

                 -output_description "attach_desc"
                 -ods "attach_desc"
                      If the output file is not in the Multics
                      storage system or its file organization
                      is not sequential, then it must be
                      specified by an attach description. The
                      attach description must be quoted. The
                      target I/O module specified via the
                      attach description must support the
                      sequential_output opening mode and the
                      iox_ entry point write_record.



3. control_args  must be chosen from the following:

                 -console_input
                 -ci
                      indicates that the Merge Description is
                      read via the I/O switch user_input
                      (which normally is the user's terminal).

                 -merge_desc md_path
                 -md md_path
                      indicates that the user is specifying
                      the pathname of the segment containing
                      the Merge Description.

                 Either the -console_input or the -merge_desc
                 argument, but not both, must be specified.
                 See the heading Merge Description below.
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NOTES 

Arguments can appear in any order, but a pathname or attach
description must immediately follow its keyword.

The Merge Description pathname (md_path) is the name of a
segment.

Any pathname may be relative (to the user's current working
directory) or absolute.
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The Merge Description contains additional information to 
specialize the Merge for a particular execution. The information 
supplied may be: 

Keys - Description of one or more key fields used for 
ranking records. 

Exits - Specification of which exit points are to be used 
and the names of the corresponding user supplied 
exit procedures. 

A Merge Description is required. As a minimum, the user
must specify how records are to be ranked, either by describing
key fields in the Keys statement or by naming a compare exit 
procedure in the Exits statement. Other information in the Merge 
Description is optional. 

The Merge Description may be supplied as a segment or read 
via the I/O switch user_input (normally the user's terminal). 

If the Merge Description is supplied in a segment, its 
pathname is specified in the -merge_desc argument. 

If the Merge Description is read via the user's terminal,
the -console_input argument is used. The Merge prints "Input:"
via the I/O switch user_output and waits for input. The user
then types the Merge Description. To terminate the Merge
Description, the user types a line consisting of a period (".")
followed by a line feed. (This line is not part of the Merge
Description.)



SYNTAX OF THE MERGE DESCRIPTION 

A Merge Description consists of a set of statements. Each 
statement must begin with a function keyword. The function 
keyword is followed by the function keyword delimiter colon
(":"). The statement itself consists of one or more parameters,
separated by parameter delimiters. The parameter delimiters are
spaces, commas (","), or (in certain specific cases as specified
below) parentheses ("(" and ")"). Each statement must end with
the statement delimiter semicolon (";").

In the descriptions below, certain notational conventions
are used. A word enclosed between the less than and greater than
symbols ("<" and ">") is a notational variable, which must be
replaced by an actual word or phrase of the Merge Description
language. A word not enclosed between < and > is an actual word
of the Merge Description language. A phrase enclosed between
brackets ("[" and "]") is optional. A phrase enclosed between
braces ("{" and "}") and followed by an ellipsis ("...") is
required, and may be repeated one or more times.



KEYS STATEMENT 

The Keys statement specifies key fields used to rank the
records of the input files. The format of the Keys statement is:

     keys: {<key_description>} ... ;

The Keys statement consists of a series of one or more
<key_description>s. The key descriptions are specified in order,
the first describing the major key and the last describing the
most minor key. Up to 20 key descriptions may be supplied.

A key description is the specification of a single key
field. The format of a <key_description> is:

     <datatype> (<size>) <position> [descending]

where:

1. <datatype>    is the data type of the key field. This
                 element is required. See the table below for
                 the encoding of <datatype>.

2. <size>        is the size of the key field. This element
                 is required.

                 For string data types, <size> is the length
                 (characters or bits) of the field. The
                 length is the exact amount of space occupied
                 by the field.

                 For arithmetic data types, <size> is the
                 precision (binary or decimal digits) of the
                 field. Scale factor, if any, must not be
                 written (it is not required by the Merge).
                 The space occupied is determined by the
                 precision in combination with the data type
                 and the alignment. (Alignment is specified
                 via <position>.) For an aligned binary field
                 (fixed or floating), the space occupied is
                 increased if necessary to an integral number
                 of words.

                 <size> must be a decimal integer. The unit
                 depends on the data type. See the table
                 below for the semantics of <size>. (The
                 rules used are the same as those used by
                 Multics PL/I.)

3. <position>    is the offset of the beginning of the key
                 field, relative to the beginning of the
                 record. Consider the record as being aligned
                 on a word boundary, as will be the case for a
                 Multics PL/I structure. This element is
                 required. There are two formats:

                 <w>  where <w> is the word offset. Words
                      are numbered from 0 for the first word
                      of the record. This format specifies to
                      the Merge that the key field is aligned
                      on a word or (if <w> is even) on a
                      double word boundary.

                 <w> (<b>)
                      where <w> is the word portion of the
                      offset and <b> is the bit portion of the
                      offset; that is, the bit offset within
                      the word. Bits are numbered from 0 to
                      35. This format implies that the key
                      field is not aligned on a word boundary.
                      If the key field is aligned on a word
                      boundary but the user specifies a bit
                      offset of 0 anyway, the Merge will
                      operate correctly although speed of
                      execution may be affected.

                 The formats for <position> and the values for
                 <w> and <b> are consistent with those shown
                 in Multics PL/I listings or used by debug.

4. descending
   dsc           specifies descending order for ranking using
                 this key field. This element may be omitted;
                 the default is ascending order for this key
                 field.
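
As an illustration of the <key_description> format described above, the following sketch splits one key description into its elements. It is written in Python and is not part of the Merge. The names KeyDescription and parse_key_description are hypothetical, lower case input is assumed, and the data type words are accepted without being checked against the datatype table.

```python
import re
from typing import NamedTuple, Optional


class KeyDescription(NamedTuple):
    datatype: str              # e.g. "char", "fixed bin", "float dec"
    size: int                  # length or precision, depending on datatype
    word_offset: int           # <w>
    bit_offset: Optional[int]  # <b>, or None when only <w> was written
    descending: bool


_KEY_RE = re.compile(
    r"""^\s*
        (?P<datatype>[a-z]+(?:\s+[a-z]+)*)\s*        # one or more words
        \(\s*(?P<size>\d+)\s*\)                      # (<size>)
        [\s,]*
        (?P<word>\d+)\s*(?:\(\s*(?P<bit>\d+)\s*\))?  # <w> or <w>(<b>)
        [\s,]*
        (?P<order>descending|dsc)?                   # optional ranking word
        \s*$""",
    re.VERBOSE,
)


def parse_key_description(text: str) -> KeyDescription:
    """Parse a single key description such as 'char(10), 0(18)'."""
    m = _KEY_RE.match(text)
    if m is None:
        raise ValueError(f"not a key description: {text!r}")
    bit = int(m["bit"]) if m["bit"] is not None else None
    if bit is not None and not 0 <= bit <= 35:
        raise ValueError("bit offset must be in the range 0 to 35")
    return KeyDescription(
        m["datatype"], int(m["size"]), int(m["word"]), bit, m["order"] is not None
    )


if __name__ == "__main__":
    print(parse_key_description("char(8), 0, descending"))
```

A bit_offset of None corresponds to the <w> format (aligned field); any written bit offset, including 0, corresponds to the <w> (<b>) format.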
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DATATYPE ENCODING AND SEMANTICS OF SIZE 



                  Encoding     Semantics of <size> (where <size> = n)
                  of
                  <datatype>   Unit        Range      Space Occupied

Character string  char         9 bit       1 - 4095   n characters
(Multics ASCII)                character

Bit string        bit          1 bit       1 - 4095   n bits

Fixed binary      bin          1 bit       1 - 71     Aligned:
                                                        1 <= n <= 35: one word
                                                       36 <= n <= 71: two words
                                                      Unaligned: n + 1 bits

Floating binary   float bin    1 bit       1 - 63     Aligned:
                                                        1 <= n <= 27: one word
                                                       28 <= n <= 63: two words
                                                      Unaligned: n + 9 bits

Fixed decimal     dec          9 bit       1 - 59     n + 1 digits
(leading sign)                 digit

Floating decimal  float dec    9 bit       1 - 59     n + 2 digits
                               digit
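
The space rules in the table can be restated as a small calculation. The sketch below is in Python and is only an illustration; the function name space_occupied_bits is hypothetical. It assumes a 36 bit word and 9 bit characters and digits, and it assumes that an aligned floating binary field of precision greater than 27 occupies two words.

```python
WORD_BITS = 36   # bits per word
DIGIT_BITS = 9   # bits per character or decimal digit

# Permitted range of <size> for each <datatype>, from the table.
SIZE_RANGE = {
    "char": (1, 4095),
    "bit": (1, 4095),
    "bin": (1, 71),
    "float bin": (1, 63),
    "dec": (1, 59),
    "float dec": (1, 59),
}


def space_occupied_bits(datatype: str, n: int, aligned: bool = False) -> int:
    """Return the number of bits occupied by a key field of size n."""
    if datatype not in SIZE_RANGE:
        raise ValueError(f"unknown datatype: {datatype!r}")
    low, high = SIZE_RANGE[datatype]
    if not low <= n <= high:
        raise ValueError(f"size {n} is outside {low} - {high} for {datatype}")
    if datatype == "char":
        return DIGIT_BITS * n
    if datatype == "bit":
        return n
    if datatype == "bin":
        if aligned:
            return WORD_BITS if n <= 35 else 2 * WORD_BITS
        return n + 1                    # precision plus sign bit
    if datatype == "float bin":
        if aligned:
            return WORD_BITS if n <= 27 else 2 * WORD_BITS
        return n + 9                    # precision plus 9 further bits
    if datatype == "dec":
        return DIGIT_BITS * (n + 1)     # digits plus leading sign
    return DIGIT_BITS * (n + 2)         # float dec: digits, sign, exponent


if __name__ == "__main__":
    print(space_occupied_bits("bin", 17, aligned=True))
    print(space_occupied_bits("bin", 17, aligned=False))
```

The aligned argument only affects the two binary types, matching the table, where alignment changes the space occupied only for fixed and floating binary fields.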



In addition to the forms shown for <datatype> in the table
above, the following variants are also permitted:

The following alternate spellings may be used:

     char | character     bin | binary     dec | decimal

The word "fixed" may be used (or omitted). For example:

     fixed bin | bin     fixed dec | dec

The words may be written in any sequence. For example:

EXAMPLES OF KEY DESCRIPTIONS



char(10), 0(18)
                 Character string, Multics ASCII code, length
                 ten characters; starts at bit 18 of word 0.

char(8), 0, descending
                 Character string, Multics ASCII code, length
                 eight characters; starts at bit 0 of word 0;
                 ranking is descending.

character(4), 0, dsc
                 Character string, Multics ASCII code, length
                 four characters; starts at bit 0 of word 0;
                 ranking is descending.

bit(16), 0(2)
                 Bit string, length 16 bits; starts at bit 2
                 of word 0.

bin(17), 2
                 Fixed binary, precision 17; since no bit
                 offset is specified, is aligned and thus
                 occupies one word (equivalent to "bin(35),
                 2").

bin(17), 2(18)
                 Fixed binary, precision 17; since a bit
                 offset is specified, is unaligned and
                 occupies 18 bits; starts at bit 18 of word 2
                 (i.e., is in the low order half of word 2).

bin(1), 2(0)
                 Fixed binary, precision 1; unaligned and thus
                 occupies 2 bits; starts at bit 0 of word 2.

bin(1), 2
                 Fixed binary, precision 1; aligned and thus
                 occupies one word (equivalent to "bin(35),
                 2").

bin(36), 2
                 Fixed binary, precision 36; since no bit
                 offset is specified and precision is greater
                 than 35 and word offset is even, is aligned
                 and occupies two words (equivalent to
                 "bin(71), 2").

dec(6), 0(9)
                 Fixed decimal, 9 bit digit, precision 6;
                 starts at bit 9 of word 0 and occupies 7
                 digits including sign (that is, through the
                 end of word 1).

float dec(9), 0(9)
                 Floating decimal, 9 bit digit, precision 9;
                 starts at bit 9 of word 0 and occupies 11
                 digits including exponent and sign (that is,
                 through the end of word 2).
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EXITS STATEMENT 

An Exits statement specifies the exit procedures to be used
during execution of the Merge. The format of an Exits statement
is:

     exits: {<exit_description>} ... ;

The Exits statement consists of a set of one or more
<exit_description>s. Exit descriptions may be specified in any
order.

An exit description is the specification of one exit point
and the user supplied exit procedure to be called at that exit
point. The format of an <exit_description> is:

     <exit_name> <user_name>

where:

1. <exit_name>   is the keyword naming the exit point at which
                 the user supplied exit procedure is to be
                 called. Exit names may be chosen from the
                 following list:

                      output_record
                      compare

2. <user_name>   is the name of the entry point of the user
                 supplied procedure. This parameter has the
                 same syntax and semantics as a command name.
                 That is:

                 User_name can be either a segment name (e.g.,
                 segment) or a segment name and an entry point
                 name (e.g., segment$entry_point). In these
                 cases, the user's current search rules are
                 applied to find the procedure. (If some
                 segment is already known by the specified
                 reference name, that segment is used.)

                 User_name can also be a pathname; that is,
                 can specify a directory hierarchy location,
                 either relative (to the user's current
                 working directory) or absolute. In this
                 case, the search rules are not applied and
                 the pathname is used to find the procedure.
                 (If some other segment is already known by
                 the specified reference name, that segment is
                 terminated first.)



WRITING EXIT PROCEDURES 

The exit points to be used during an execution of the Merge 
and the names of the corresponding user supplied exit procedures 
are specified in the Exits statement as described above. The 
specifications for writing exit procedures (PL/I declare and call 
statements) and the functional requirements imposed upon exit 
procedures are given in the description of the merge_ subroutine
in Section II of MPM Subroutines.

     merge -if merge.in_1 -if merge.in_2 -output_file =.out -ci
     Input:
     keys: char(10), 0;
     .

In this example, the arguments of the command state that
there are two input files, whose pathnames are merge.in_1 and
merge.in_2; the output file pathname is merge.out; and the
Merge Description is input via the user's terminal.

The Merge Description states that there is one key, a
character string of length 10 characters, starting at word 0 bit
0 of the record. There are no exits specified.

     merge -input_file in_1 -if in_2 -of out_1 -merge_desc md

In this example, the arguments of the command state that the
input files are named in_1 and in_2; the output file is named
out_1; and the Merge Description is contained in the segment
named md.

Assume that the segment md contains:

     keys: fixed bin(35) 0, char(8) 1;
     exits: output_record user$output;

The Merge Description states that there are two keys. The
major key is an aligned fixed binary field of precision 35,
contained in word 0 of the record. The minor key is a character
string of length 8, contained in words 1 and 2 of the record.

There is one exit, an output_record procedure exit; the
output_record exit procedure entry point is named user$output.



     merge -input_description "tape_ansi_ vol_1 -name a" -if b \
           -output_description "vfile_ c -extend" -ci

In this example, there are two input files. The first input
file is specified by an attach description for the I/O module
tape_ansi_ with the attach argument "vol_1 -name a". The second
input file is specified by the pathname b, and thus must be a
sequential or indexed file in the storage system. The output
file is specified by an attach description for the I/O module
vfile_ with the attach argument "c -extend". For the I/O module
vfile_, this means that the pathname is c and the file is to be
extended; that is, output records from the Merge will be written
at the end of the file c (if it already exists).

(A \ followed by a line feed is used to continue the command
arguments onto the second line.)

The Merge Description (not shown) will be read from the
user's terminal.



     merge -ids "record_stream_ -target vfile_ a" \
           -ids "syn_ user_switchname" -of c -console_input

In this example, assume that the first input file is an
unstructured file in the storage system, with the pathname a.
This input file has been specified by an attach description using
the I/O module record_stream_, which will transform the record
I/O operations requested by the Merge into the appropriate stream
I/O operations for the target file a. The second input file is
attached using the I/O module syn_ to the I/O switch
user_switchname, which must be attached and closed.

Name: sort_

The sort_ subroutine provides a generalized file sorting
capability, which is specialized for execution by user supplied
parameters. The basic function of sort_ is to read one or more
input files of records which are not ordered, sort those records
according to the values of one or more key fields, and write a
single output file of ordered (or "ranked") records. The sort_
subroutine has the following general capabilities:

     Input and output files may be on any storage medium and in
     any file organization;

     Very large files, such as multisegment files, can be sorted;

     Multiple key fields and most PL/I string and numeric data
     types may be specified;

     Exits to user supplied subroutines are permitted at several
     points during the sorting process.

The arguments to the sort_ subroutine include one or more
pointers to additional information necessary to specialize sort_
for execution. This additional information is called the Sort
Description.
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INPUT AND OUTPUT 

The user can specify the input and output files. In this 
environment the Sort reads the input files and writes the output 
file. Each input or output file may be stored on any medium and
in any file organization supported by an I/O module through iox_.
The I/O module may be one of the Multics system I/O modules (such
as tape_ansi_), or one supplied by a specific installation, or
one written by a user. An input or output file is specified
either by a pathname or by an attach description.

Alternatively, the user can supply either an input_file
procedure or an output_file procedure (or both). An input_file
procedure is responsible for reading input and releasing records
to the Sort. An output_file procedure is responsible for
retrieving records (ranked by the Sort) from the Sort and writing
output.

In all cases, records may be either fixed length or variable
length.



KEY FIELDS

The user can specify the key fields to be used in ranking
records. Key fields are described in the Keys statement - or in
the keys structure - of the Sort Description. Up to 20 key
fields may be specified. Any PL/I string or numeric data type -
except complex or pictured - may be specified for a given key
field. Ranking may be ascending, descending, or mixed. For a
character string key field, the collating sequence is that of the
Multics standard character set.

Alternatively, the user can supply a compare procedure,
which is then used to rank records.

The original input order of records with equal keys is
preserved (FIFO order). Original input order is defined as
follows:

1. If two equal records come from different input files, then
the record from the file which is specified earlier in the
list of input files (in the input_specs subroutine argument)
is first.

2. If two equal records come from the same input file, then the
record which is earlier in the file is first.
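
The combination of mixed ascending and descending keys with FIFO order for equal keys can be sketched as follows. This is an illustration in Python, not the Sort's implementation, and the name rank_records and the (extractor, descending) key format are hypothetical. It relies on the fact that Python's list sort is stable, including when reverse is requested.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

R = TypeVar("R")
KeySpec = Tuple[Callable[[R], object], bool]  # (field extractor, descending?)


def rank_records(records: Sequence[R], keys: Sequence[KeySpec]) -> List[R]:
    """Rank records by several key fields, major key first.

    One stable pass is made per key, from the most minor key to the
    major key. Records whose keys are all equal keep their input order.
    """
    ranked = list(records)
    for extract, descending in reversed(keys):
        ranked.sort(key=extract, reverse=descending)
    return ranked


if __name__ == "__main__":
    # Records from all input files, concatenated in input_specs order.
    records = [("b", 2, "r0"), ("a", 1, "r1"), ("b", 2, "r2"), ("a", 3, "r3")]
    keys = [(lambda r: r[0], False),   # major key, ascending
            (lambda r: r[1], True)]    # minor key, descending
    for rec in rank_records(records, keys):
        print(rec)
```

Concatenating the input files in the order given, before ranking, makes both FIFO rules follow from the stability of each pass.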
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EXITS 



The Sort provides exits to user supplied procedures at
specific points during the sorting process. Exit procedures are
named in the Exits statement - or in the exits and io_exits
structures - of the Sort Description. The following exit points
are provided:

input_file       To obtain input records and release them one
                 by one to the sorting process.

output_file      To retrieve ranked records one by one from
                 the sorting process and output them.

input_record     To perform special processing for each input
                 record, such as deleting, inserting, or
                 altering records to be input to the Sort.

output_record    To perform special processing for each output
                 record, such as deleting, inserting, or
                 altering records to be output from the Sort;
                 or summarizing data by accumulating it into a
                 summary record.

compare          To compare two records; that is, to rank them
                 for the sorting process.



Details of exit procedures are given below under the heading
Writing Exit Procedures.

     dcl sort_ entry ((*)char(*), char(*), (*)ptr, char(*),
          char(*), float bin(27), fixed bin(35));

     call sort_ (input_specs, output_spec, sort_desc, temp_dir,
          user_out_sw, file_size, code);

where:



1. input_specs   is an array containing the specifications of
                 the input files. Up to 10 input files may be
                 specified. The array extent specifies the
                 number of input files. (Input)

                 Input file j is specified in the array
                 element input_specs(j), in one of the
                 following forms:

                 -input_file pathname
                 -if pathname
                      If an input file is in the Multics
                      storage system and its file organization
                      is either sequential or indexed, then it
                      may be specified by its pathname. The
                      file may be either a single segment or a
                      multisegment file. The star convention
                      cannot be used.

                      An input file specified by a pathname
                      will be attached using the attach
                      description "vfile_ pathname".

                 -input_description attach_desc
                 -ids attach_desc
                      If an input file is not in the Multics
                      storage system or its file organization
                      is neither sequential nor indexed, then
                      it must be specified by an attach
                      description. The target I/O module
                      specified via the attach description
                      must support the sequential_input
                      opening mode and the iox_ entry point
                      read_record.

                 Pathnames and attach descriptions can be
                 intermixed in the input_specs array.

                 If the user is supplying an input_file exit
                 procedure, then input_specs(1), the first
                 input file specification, must be "" (the
                 array extent should be 1) and the input_file
                 exit procedure must be named in the io_exits
                 structure of the Sort Description.

2. output_spec   is the specification of the output file.
                 Only one output file may be specified.
                 (Input)

                 The output file may be specified in one of
                 the following forms:

                 -output_file pathname
                 -of pathname
                      If the output file is in the Multics
                      storage system and its file organization
                      is sequential, then it may be specified
                      by its pathname. The file may be either
                      a single segment or a multisegment file.

                      The equals convention can be used. If
                      it is, it is applied to the pathname of
                      the first input file and the first input
                      file must be specified by a pathname,
                      not by an attach description.

                      An output file specified by a pathname
                      will be attached using the attach
                      description "vfile_ pathname". Thus if
                      the file does not exist, it will be
                      created. If it does exist, it will be
                      overwritten.

                 -output_file -replace
                 -of -rp
                      The output file is to replace the first
                      input file. That input file will be
                      overwritten during the merge phase of
                      the Sort. If -replace is used, the
                      first input file must be specified by a
                      pathname, not by an attach description.

                 -output_description attach_desc
                 -ods attach_desc
                      If the output file is not in the Multics
                      storage system or its file organization
                      is not sequential, then it must be
                      specified by an attach description. The
                      target I/O module specified via the
                      attach description must support the
                      sequential_output opening mode and the
                      iox_ entry point write_record.

                 If the user is supplying an output_file exit
                 procedure, then the output_spec argument must
                 be "" and the output_file exit procedure must
                 be named in the io_exits structure of the
                 Sort Description.



3. sort_desc     is an array of pointers to the Sort
                 Description. See the heading Sort
                 Description below. (Input)

4. temp_dir      is the pathname of the directory which will
                 contain the Sort's work files. (Input)

                 If this argument is "", then work files will
                 be contained in the user's process directory.

                 This argument should be used when the process
                 directory will not be large enough to contain
                 the work files. The get_wdir_ function may
                 be used to obtain the name of the user's
                 current working directory.



5. user_out_sw   specifies the destination of both the summary
                 report and diagnostic messages for errors
                 detected in the arguments to sort_ or in the
                 Sort Description. (Input)

                 This argument may have the following values:

                 ""          = write the summary report and
                               diagnostic messages via the I/O
                               switch user_output.

                 "-bf"       = do not write the summary report
                               and diagnostic messages. If any
                               errors are diagnosed, sort_ will
                               return with the status code
                               bad_arg but information about
                               the number and nature of the
                               errors is not available.

                 switchname  = write the summary report and
                               diagnostic messages via the I/O
                               switch named switchname. The
                               switch must be attached and open
                               for stream output.

6. file_size     is the total amount of data to be sorted, in
                 millions of bytes. If this argument is zero,
                 the default assumption is approximately one
                 million bytes (file_size = 1.0). (Input)

                 This argument is intended for use when some
                 or all of the input files are not in the
                 storage system (that is, are not specified by
                 pathnames) or when an input_file exit
                 procedure is used. In these cases the Sort
                 cannot determine the amount of input data.
                 (The Sort does compute the total amount of
                 input data which is in the storage system,
                 using segment bit counts.) The file_size
                 argument may also be used when all of the
                 input files are in the storage system but
                 records are to be inserted or deleted through
                 an input_record exit procedure.

                 The file_size argument is used for
                 optimization of performance; the actual
                 amount of data can be considerably larger
                 without preventing the Sort from completing.
                 The maximum amount of data which can be
                 sorted is (in bytes) approximately 60 million
                 times the square root of file_size.

7. code          is a standard Multics status code returned by
                 sort_. Possible values are listed below
                 under the heading Status Codes. (Output)
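
The capacity rule stated for the file_size argument (approximately 60 million times the square root of file_size, in bytes) can be worked through numerically. The sketch below is in Python and is only an arithmetic illustration; the function name approx_max_sortable_bytes is hypothetical, and a file_size of zero is treated as the documented default of 1.0.

```python
import math


def approx_max_sortable_bytes(file_size: float) -> float:
    """Approximate upper limit, in bytes, on the data that can be sorted.

    file_size is expressed in millions of bytes, as for the sort_ argument.
    """
    if file_size < 0:
        raise ValueError("file_size must not be negative")
    effective = file_size if file_size > 0 else 1.0  # zero means default
    return 60_000_000 * math.sqrt(effective)


if __name__ == "__main__":
    for size in (0.0, 1.0, 4.0, 100.0):
        print(size, approx_max_sortable_bytes(size))
```

Because the limit grows with the square root of file_size, quadrupling the declared size only doubles the approximate maximum; for example, declaring 4.0 raises the limit from about 60 million to about 120 million bytes.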



NOTES 

The temporary directory pathname (temp_dir argument) is the 
name of a directory. 

Any pathname may be relative (to the user's current working 
directory) or absolute. 



STATUS CODES

The following status codes may be returned by sort_ (all
codes are in error_table_):

0                Normal return (no errors).

bad_arg          One or more arguments specified to sort_,
                 including those in the Sort Description, was
                 invalid or inconsistent. The Sort will have
                 previously written diagnostic messages as
                 directed by the user_out_sw argument. The
                 sorting process itself has not been started.

fatal_error      The Sort has encountered a fatal error during
                 the sorting process. The Sort will have
                 previously generated a specific error message
                 and signalled the sub_error_ condition via
                 the sub_err_ subroutine.

out_of_sequence  The call to sort_ is not in the sequence
                 required by the Sort; that is, sort_ has
                 been called after initiation of the Sort but
                 before termination of that invocation.
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The Sort Description contains additional information to 
specialize the Sort for a particular execution. The Sort 
Description is specified via the sort_desc argument to sort_. 
The information specified may be:

     Keys  - Description of one or more key fields used for
             ranking records.

     Exits - Specification of which exit points are to be used
             and the names of the corresponding user supplied
             exit procedures.

A Sort Description is required. As a minimum, the user must
specify how records are to be ranked, either by describing key
fields in the Keys statement or by naming a compare exit
procedure in the Exits statement. Other information in the Sort
Description is optional.

The Sort Description may be supplied to sort_ in either of
two forms, called source form and internal form.

The source form of the Sort Description is written exactly
as specified for the sort command (see the Multics Programmers'
Manual, Commands and Active Functions, Section III), and is
stored as an ASCII segment; that is, as an unstructured file in
the Multics storage system. If source form is used, then the
sort_desc argument to sort_ must have an array extent of 1 and
the one pointer must be a pointer to the segment. (The segment
must contain only the Sort Description.) The source form is
useful when the user writes the Sort Description and supplies it
to the procedure which calls sort_.

The internal form of the Sort Description is a set of one to
three structures. The sort_desc argument must have an array
extent of 3, and the three pointers are pointers to the three
structures. Any of the structures can be omitted; in that case
the corresponding pointer must be null. The pointers must be
specified in the array in the following order:

     addr (keys)
     addr (exits)
     addr (io_exits)

where the three structures (keys, exits, and io_exits) are
defined below. The internal form is useful when the procedure
calling sort_ constructs the Sort Description.
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KEYS STRUCTURE 

The keys structure is used when the caller describes Key 
fields. The Sort's standard compare routine will then be used to 
rank records. If the caller describes keys* then the compare 
exit must not be specified. 

If the caller does not describe keys* then the corresponding 
pointer in the array sort_desc must be null and the compare exit
must be specified in the exits structure. The user supplied 
compare routine will then be used to rank records. 

The keys structure is:

     dcl 1 keys,
           2 version fixed bin init(1),
           2 number fixed bin,
           2 key_desc (user_keys_number refer(keys.number)),
             3 datatype char(8),
             3 size fixed bin(24),
             3 word_offset fixed bin(18),
             3 bit_offset fixed bin(6),
             3 desc char(3);

where:

1. version       is the version number of the structure (must
                 be 1).

2. number        is the number of key fields, established by
                 the value of user_keys_number.

3. key_desc      is an array of key descriptions. Each key
                 description is one element of the array. The
                 key descriptions must be specified in order,
                 the major key first and the most minor key
                 last.

4. datatype      is the data type of the key field. See the
                 table below for the encoding of datatype.
                 The value must be left justified within
                 datatype.

5. size          is the size of the key field, in units which
                 depend on the data type.

                 For string data types, size is the exact
                 length (characters or bits) of the field.

                 For arithmetic data types, size is the
                 precision (binary or decimal digits) of the
                 field. The space occupied is determined by
                 precision in combination with the data type.
                 The space occupied is not adjusted for an
                 aligned field. For example, for an aligned
                 fixed binary field of one word, size must be
                 specified as 35; for an aligned floating
                 binary field of two words, size must be
                 specified as 63. See the table below for the
                 semantics of size.

6. word_offset   is the word portion of the offset of the
                 beginning of the key field, relative to the
                 beginning of the record. Consider the record
                 as being aligned on a word boundary, as will
                 be the case for a Multics PL/I structure.
                 Words are numbered from 0 for the first word
                 of the record.

7. bit_offset    is the bit portion of the offset of the key
                 field; that is, the bit offset within the
                 word in which the key field begins. Bits are
                 numbered from 0 to 35. (If the field is
                 aligned on a word boundary, then bit_offset
                 is 0.)

8. desc          indicates whether ranking for this key field
                 is to be ascending or descending. Possible
                 values are:

                 ""     = use ascending ranking.

                 "dsc"  = use descending ranking.

DATATYPE ENCODING AND SEMANTICS OF SIZE



                    Encoding |          Semantics of size
                       of    |          (where size = n)
                    datatype | Unit              Range      Space Occupied

Character string      char     9 bit character   1 - 4095   n characters
(Multics ASCII)

Bit string            bit      1 bit             1 - 4095   n bits

Fixed binary          bin      1 bit             1 - 71     n + 1 bits

Floating binary       flbin    1 bit             1 - 63     n + 9 bits

Fixed decimal         dec      9 bit digit       1 - 59     n + 1 digits
(leading sign)

Floating decimal      fldec    9 bit digit       1 - 59     n + 2 digits
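
As an illustration of how size, word_offset, and bit_offset follow from a record layout, consider the sketch below. The employee structure and its field names are hypothetical and are not part of the Sort; the comments give the values that the rules above appear to imply for each field, assuming 36 bit words and 9 bit characters.

```pli
/* Hypothetical record layout, used only to illustrate key descriptions. */
dcl 1 employee based aligned,
      2 name    char(20) unaligned,     /* 180 bits: words 0 through 4 */
      2 dept    char(2)  unaligned,     /* word 5, bits 0 through 17   */
      2 grade   char(1)  unaligned,     /* word 5, bits 18 through 26  */
      2 salary  fixed bin(35) aligned;  /* next word boundary: word 6  */

/* Key field   datatype   size   word_offset   bit_offset   */
/* name        char        20        0             0        */
/* dept        char         2        5             0        */
/* grade       char         1        5            18        */
/* salary      bin         35        6             0        */
```

Note that salary occupies a full word, yet its size is given as the precision (35), in keeping with the rule that the space occupied is not adjusted for an aligned field.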
EXITS STRUCTURE

The exits structure is:

     dcl 1 exits,
           2 version         fixed bin init(1),
           2 compare         entry,
           2 input_record    entry,
           2 output_record   entry;

where:

1.   version        is the version number of the structure (must
                    be 1).

2.   compare        specifies the entry point of a user supplied
                    compare exit procedure.  If the caller
                    describes key fields (supplies a keys
                    structure), then this exit must not be
                    specified.

3.   input_record   specifies the entry point of a user supplied
                    input_record exit procedure.  This exit can
                    be specified whether or not the input_file
                    exit is specified.

4.   output_record  specifies the entry point of a user supplied
                    output_record exit procedure.  This exit can
                    be specified whether or not the output_file
                    exit is specified.



IO_EXITS STRUCTURE

The io_exits structure is:

     dcl 1 io_exits,
           2 version         fixed bin init(1),
           2 input_file      entry,
           2 output_file     entry;

where:

1.   version        is the version number of the structure (must
                    be 1).

2.   input_file     specifies the entry point of a user supplied
                    input_file exit procedure.  If the caller
                    names input files, then this exit must not be
                    specified.

3.   output_file    specifies the entry point of a user supplied
                    output_file exit procedure.  If the caller
                    names the output file, then this exit must
                    not be specified.



ENTRY VARIABLES 

In the exits and io_exits structures, each exit point is
specified via an entry variable.  The entry variable must be set
(either initialized or assigned) by a user procedure, normally
the procedure which calls sort_.  The entry variable can identify
either an internal entry point (that is, an internal procedure)
or an external entry point of the procedure which sets the entry
variable; or it can identify an external entry point of another
user procedure.

If none of the exits declared in either the exits or
io_exits structure is to be used, then that structure can be
omitted and the corresponding pointer in the array sort_desc must
be null.  If the structure is included but an exit specified in
it is not to be used, then the corresponding entry variable must
be set to sort_$noexit, which is declared:

     dcl sort_$noexit entry external;
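
The rules above can be sketched as follows for a caller that wants only an input_record exit. This is a minimal sketch, not a definitive implementation: my_input_record is a hypothetical user procedure, and the step that stores a pointer to the structure in the sort_desc array is omitted because it depends on the calling sequence described elsewhere.

```pli
/* Sketch: use only an input_record exit; the other exits are disabled. */
dcl 1 exits,
      2 version         fixed bin init(1),
      2 compare         entry,
      2 input_record    entry,
      2 output_record   entry;

dcl sort_$noexit        entry external;
dcl my_input_record     entry (ptr, fixed bin(21), fixed bin, bit(1));
                                        /* hypothetical user procedure */

     exits.compare       = sort_$noexit;     /* no compare exit        */
     exits.input_record  = my_input_record;  /* the one exit in use    */
     exits.output_record = sort_$noexit;     /* no output_record exit  */
```

Every entry variable in an included structure is assigned before the call to sort_, since a change made afterward has no effect.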

An exit point may not be altered after the call to sort_.
Any change to the entry variable thereafter will have no effect.
However, certain entry points can be disabled, as specified in
the descriptions of the individual exit procedures below.

A user supplied exit procedure is called by the Sort to
perform a specified function.  The user procedure must perform
that function, and then must return to the Sort.  The user exit
procedure may perform additional functions desired by the user.

Certain exit procedures replace the corresponding standard
routine of the Sort.  Other exit procedures supplement the normal
functions of the Sort.  This is specified for each individual
exit procedure below.

The following exit points are provided:

     input_file
     output_file
     compare
     input_record
     output_record

All exit points may be active during the same invocation of
the Sort.

The entry point names of all user supplied exit procedures
are defined by the user.  Specific names are shown below only for
convenience in discussion.

INPUT_FILE EXIT PROCEDURE

An input_file exit procedure replaces the standard input
reading function of the Sort.  The Sort calls the input_file exit
procedure only once during an execution of the Sort.

An input_file exit procedure must perform the following
function:  For each record which is input by the user to the
sorting process, the input_file exit procedure must make one call
to the entry sort_$release (described later).  After the
input_file exit procedure has released the last input record to
the Sort, it must return to the Sort.

Usage

     input_file:  proc (code);

     dcl code fixed bin(35) parameter;

where code is a standard Multics status code (in error_table_)
which must be returned by the input_file exit procedure.  If the
value is not 0, then the Sort normally prints the corresponding
message and returns to its caller with the status code
fatal_error.  (Output)
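
A minimal sketch of an input_file exit procedure follows. To stay self-contained it releases records from a small in-memory array (recs, a hypothetical stand-in for whatever source the user actually reads) rather than from a file. It stops at the first nonzero status code, although some codes returned by sort_$release permit the caller to continue.

```pli
input_file:  proc (code);

dcl code            fixed bin(35) parameter;
dcl sort_$release   entry (ptr, fixed bin(21), fixed bin(35));
dcl (addr, hbound, length) builtin;

/* Hypothetical input data; a real procedure would read its own source. */
dcl recs (3)        char(16) static init ("charlie", "alpha", "bravo");
dcl i               fixed bin;

     code = 0;
     do i = 1 to hbound (recs, 1) while (code = 0);
          /* One call to sort_$release per input record. */
          call sort_$release (addr (recs (i)), length (recs (i)), code);
     end;

     /* Return to the Sort; a nonzero code is passed back unchanged. */
end input_file;
```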






OUTPUT_FILE EXIT PROCEDURE

An output_file exit procedure replaces the standard output
writing function of the Sort.  The Sort calls the output_file
exit procedure only once during an execution of the Sort.

An output_file exit procedure must perform the following
function:  For each record which is to be retrieved in ranked
order from the Sort, the output_file exit procedure must make one
call to the entry point sort_$return (described later).  If
sort_$return is called but there are no more records to be
retrieved from the sorting process, then sort_$return returns
with the status code end_of_info.  The output_file exit procedure
then must return to the Sort.  If the user desires, the
output_file exit procedure may terminate retrieval at any time
prior to receiving the end_of_info status, but it must still
return to the Sort.  (The entry sort_$return may return status
codes other than end_of_info in case of error.)

Usage

     output_file:  proc (code);

     dcl code fixed bin(35) parameter;

where code is a standard Multics status code (in error_table_)
which must be returned by the output_file procedure.  If the
value is not 0, then the Sort normally prints the corresponding
message and returns to its caller with the status code
fatal_error.  (Output)
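
The retrieval loop described above might be sketched as follows. This is an illustration only: the record is merely counted (n_records is a hypothetical variable) where a real procedure would write or process it, and the loop stops at any nonzero status code even though some codes returned by sort_$return permit retrieval to continue.

```pli
output_file:  proc (code);

dcl code            fixed bin(35) parameter;
dcl sort_$return    entry (ptr, fixed bin(21), fixed bin(35));
dcl error_table_$end_of_info fixed bin(35) external;

dcl rec_ptr         ptr;
dcl rec_len         fixed bin(21);
dcl rec             char (rec_len) based (rec_ptr);  /* current record */
dcl n_records       fixed bin(35) init (0);          /* hypothetical   */

     call sort_$return (rec_ptr, rec_len, code);
     do while (code = 0);
          n_records = n_records + 1;    /* process rec here */
          call sort_$return (rec_ptr, rec_len, code);
     end;

     /* end_of_info is the normal end of data, not an error. */
     if code = error_table_$end_of_info then code = 0;
end output_file;
```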
COMPARE EXIT PROCEDURE

A compare exit procedure replaces the standard record
comparison procedure of the Sort.  The Sort calls the compare
exit procedure each time the sorting process is ready to rank two
records; that is, to determine which of the two is first in the
sorted order.

A compare exit procedure must perform the following
function:  The compare exit procedure receives as arguments a
pointer to each of the two records.  The compare exit procedure
must determine which of the two records is first - or that they
are equal in rank - and must return a corresponding return value
to the Sort.  The compare exit procedure is invoked as a
function.



Usage

     compare:  proc (rec_ptr_1, rec_ptr_2) returns(fixed bin(1));

     dcl (rec_ptr_1       ptr,
          rec_ptr_2       ptr) parameter;
     dcl result           fixed bin(1);

          return(result);
     end compare;

where:

1.   rec_ptr_1      is a pointer to a double word aligned buffer
                    containing the first record of the pair to be
                    compared.  This record is always the first of
                    the two according to the original input
                    order.  (Input)

2.   rec_ptr_2      is a pointer to a double word aligned buffer
                    containing the second record of the pair to be
                    compared.  (Input)

3.   result         is the result of the comparison.  (Output)

                    Possible values are:

                     0 = the two records rank equal.

                    -1 = the record pointed to by rec_ptr_1 ranks
                         first.

                    +1 = the record pointed to by rec_ptr_2 ranks
                         first.

If a compare exit procedure requires the length of either
record, it is available in the word preceding that record in the
form:

     dcl rec_len fixed bin(21) aligned;

A compare exit procedure cannot alter either the content or
the length of either record.
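
A compare exit procedure along these lines might look like the sketch below. The record layout (rec, with fields dept and salary) is hypothetical; the procedure ranks by ascending dept and, within a dept, by descending salary, which is the kind of ordering a user might prefer to code directly.

```pli
compare:  proc (rec_ptr_1, rec_ptr_2) returns (fixed bin(1));

dcl (rec_ptr_1, rec_ptr_2) ptr parameter;

/* Hypothetical record layout applied to both record buffers. */
dcl 1 rec based aligned,
      2 dept      char(4),
      2 salary    fixed bin(35);

     /* Major key: dept, ascending. */
     if rec_ptr_1 -> rec.dept < rec_ptr_2 -> rec.dept then return (-1);
     if rec_ptr_1 -> rec.dept > rec_ptr_2 -> rec.dept then return (1);

     /* Minor key: salary, descending (larger salary ranks first). */
     if rec_ptr_1 -> rec.salary > rec_ptr_2 -> rec.salary then return (-1);
     if rec_ptr_1 -> rec.salary < rec_ptr_2 -> rec.salary then return (1);

     return (0);                        /* the two records rank equal */
end compare;
```

The procedure only reads the two records, consistent with the rule that a compare exit procedure cannot alter them.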






INPUT_RECORD EXIT PROCEDURE

An input_record exit procedure may be used whether the
Sort's standard input_file procedure or a user supplied
input_file exit procedure is used, and supplements that
input_file process.  The Sort calls the input_record exit
procedure:

1.   Each time the input_file process releases a record to the
     Sort, and before that record is entered into the sorting
     process;

2.   Once more after the last input record has been released to
     the Sort (end of input);

3.   Additionally, each time the input_record exit procedure
     returns with an action of insert.

The Sort gives the input_record exit procedure access to the
current record, the record about to be entered into the sorting
process.

An input_record exit procedure need not perform any
processing.  If it does not, then the Sort will accept the
current record into the sorting process.

An input_record exit procedure may perform the following
functions, which are accomplished via the values of arguments
returned when the input_record exit procedure returns to the
Sort:

Accept the current record.  This is accomplished by setting
action = 0.

Delete the current record.  This is accomplished by setting
action = 1.

Insert one or more records before the current record.  (At
the last call to the input_record exit procedure, records
may be inserted at the end of input.)  This is accomplished
by setting rec_ptr to point to the record to be inserted,
setting rec_len appropriately, and setting action = 3.

Alter the current record, before it is entered into the
sorting process.  This is accomplished by altering the
record pointed to by rec_ptr or setting rec_ptr to point to
another record, setting rec_len appropriately, and setting
action = 0.

Close the exit point so that the input_record exit procedure
will not be called again during this execution of the Sort.
This is accomplished by setting close_exit_sw = "1".

The input_record exit procedure must return to the Sort each
time it is called.



Usage

     input_record:  proc (rec_ptr, rec_len, action, close_exit_sw);

     dcl (rec_ptr         ptr,
          rec_len         fixed bin(21),
          action          fixed bin,
          close_exit_sw   bit(1)) parameter;

where:

1.   rec_ptr        points to a double word aligned buffer
                    containing the current record.  The
                    input_record exit procedure may alter the
                    contents of the record or may change the
                    pointer to point to another record.  For the
                    actions of accept and insert, the Sort will
                    use the value of rec_ptr returned to it by
                    the input_record exit procedure.
                    (Input/Output)

                    At the last call to the input_record exit
                    procedure (end of input), there is no current
                    record and rec_ptr = null().

2.   rec_len        is the length of the current record in bytes.
                    The input_record exit procedure may change
                    the length of the record.  For the actions of
                    accept and insert, the Sort will use the
                    value of rec_len returned to it by the
                    input_record exit procedure.  (Input/Output)

3.   action         indicates the action to be taken upon return
                    to the Sort.  (Input/Output)

                    Arguments referred to below are the values
                    returned to the Sort by the input_record exit
                    procedure.

                    Possible values of action are:



                    0 = accept the current record.  The record
                        pointed to by rec_ptr, whose length is
                        given by rec_len, is entered into the
                        sorting process.

                        Each time the input_record exit procedure
                        is called, the Sort sets action to this
                        value.

                    1 = delete the current record.  The current
                        record is not entered into the sorting
                        process.

                    3 = insert a record.  The record pointed to
                        by rec_ptr, whose length is given by
                        rec_len, is entered into the sorting
                        process.  The Sort calls the input_record
                        exit procedure again, so that the current
                        record may be accepted or deleted or an
                        additional record may be inserted.  At
                        this next call to the input_record exit
                        procedure, the current record remains the
                        same.

                    At the last call to the input_record exit
                    procedure (end of input), if the input_record
                    exit procedure inserts records then they are
                    appended at the end of input.  Any other
                    value for action means do not append any
                    records, and the input_record exit will not
                    be taken again.

4.   close_exit_sw  indicates whether the exit is to be closed
                    hereafter.  (Input/Output)

                    Possible values are:

                    "0" = keep this exit open.  Each time the
                          input_record procedure is called, the
                          Sort sets close_exit_sw to this value.

                    "1" = close this exit.  The Sort will not
                          call the input_record exit procedure
                          again during this execution of the Sort
                          (even if the action is insert).
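
As a small illustration of the action argument, the sketch below deletes blank records and accepts all others. It assumes records are character data; the based variable rec is a hypothetical view of the current record, not something supplied by the Sort.

```pli
input_record:  proc (rec_ptr, rec_len, action, close_exit_sw);

dcl (rec_ptr          ptr,
     rec_len          fixed bin(21),
     action           fixed bin,
     close_exit_sw    bit(1)) parameter;

dcl rec               char (rec_len) based (rec_ptr);  /* current record */
dcl null              builtin;

     /* Last call (end of input): there is no current record, and   */
     /* nothing is appended because action is left at its value 0.  */
     if rec_ptr = null () then return;

     /* The Sort has already set action = 0 (accept), so only the   */
     /* delete case needs to be set explicitly.                      */
     if rec = "" then action = 1;
end input_record;
```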
OUTPUT_RECORD EXIT PROCEDURE

An output_record exit procedure may be used whether the
Sort's standard output_file procedure or a user supplied
output_file exit procedure is used, and supplements that
output_file process.  The Sort calls the output_record exit
procedure:

1.   Each time it has determined the next record in ranked order
     from the merging process;

2.   Once more after the last record has been obtained from the
     merging process (end of output);

3.   Additionally, each time the output_record exit procedure
     returns with an action of insert.

The Sort gives the output_record exit procedure access to
two records:

1.   The output record, about to be written to the output file.
     (If an output_file exit procedure has been specified by the
     user, this is the record about to be returned to that exit
     procedure.)

2.   The next record, the record leaving the merging process.

An output_record exit procedure need not perform any
processing.  If it does not, then the output record is accepted
for the output file.

An output_record exit procedure may perform the following
functions, which are accomplished via the values of arguments
returned when the output_record exit procedure returns to the
Sort:

Accept the output record.  This is accomplished by setting
action = 0.

Delete the output record.  This is accomplished by setting
action = 1.

Delete the record leaving the merge.  This is accomplished
by setting action = 2.

Insert one or more records after the output record.  (At the
first call to the output_record exit procedure, records may
be inserted at the beginning of output.  At the last call to
the output_record exit procedure, records may be inserted at
the end of output.)  This is accomplished by setting
rec_ptr_2 to point to the record to be inserted, setting
rec_len_2 appropriately, and setting action = 3.

Alter the output record, before it is written to the output
file.  This is accomplished by altering the record pointed
to by rec_ptr_1 or setting rec_ptr_1 to point to another
record, setting rec_len_1 appropriately, and setting action
= 0 to accept (or action = 3 to insert).

Summarize data into the first record of a sequence of
records with equal keys, and delete the succeeding records
of the sequence.  This may be accomplished as follows:  At
the first call to the output_record exit procedure, set
equal key checking on (equal_key_sw = "1").  At subsequent
calls to the output_record exit procedure, if the output
record and the record leaving the merge have equal keys
(equal_key = 0), then accumulate data into the output record
and delete the record leaving the merge (action = 2).  If
the two records have unequal keys (equal_key ^= 0), then
accept the output record (action = 0).

Summarize data into the last record of a sequence with equal
keys, and delete the preceding records of the sequence.
This may be accomplished as follows:  At the first call to
the output_record exit procedure, set equal key checking on.
At subsequent calls, if the two records have equal keys then
accumulate data into a work area and delete the output
record (action = 1).  If the two records have unequal keys,
then alter the output record using the accumulated data and
accept that record (action = 0).

Sequence check the output file.  This is accomplished by
setting seq_check_sw = "1".  If the output record will not
collate properly with the output file, or does not have its
keys in the position specified to the Sort, then set
seq_check_sw = "0".

Close the exit point so that the output_record exit
procedure will not be called again during this execution of
the Sort.  This is accomplished by setting close_exit_sw =
"1".

The output_record exit procedure must return to the Sort
each time it is called.

Usage



     output_record:  proc (rec_ptr_1, rec_len_1, rec_ptr_2, rec_len_2,
                           action, equal_key, equal_key_sw,
                           seq_check_sw, close_exit_sw);

     dcl (rec_ptr_1       ptr,
          rec_len_1       fixed bin(21),
          rec_ptr_2       ptr,
          rec_len_2       fixed bin(21),
          action          fixed bin,
          equal_key       fixed bin(1),
          equal_key_sw    bit(1),
          seq_check_sw    bit(1),
          close_exit_sw   bit(1)) parameter;



where:

1.   rec_ptr_1      points to a double word aligned buffer
                    containing the output record.  The
                    output_record exit procedure may alter the
                    contents of this record or may change the
                    pointer to point to another record.  The Sort
                    uses the value of rec_ptr_1 returned to it by
                    the output_record exit procedure as specified
                    below in the description of the action
                    argument.  (Input/Output)

                    At the first call to the output_record exit
                    procedure (beginning of output), there is no
                    output record and rec_ptr_1 = null().

2.   rec_len_1      is the length of the output record in bytes.
                    The output_record exit procedure may change
                    the length of this record.  The Sort uses the
                    value of rec_len_1 returned to it by the
                    output_record exit procedure as specified
                    below in the description of the action
                    argument.  (Input/Output)

3.   rec_ptr_2      points to a double word aligned buffer
                    containing the record leaving the merge.  The
                    output_record exit procedure may not alter
                    the contents of this record.  For all actions
                    except insert, the Sort will ignore the value
                    of rec_ptr_2 returned to it by the
                    output_record exit procedure.  If the action
                    is insert, then the output_record exit
                    procedure must change rec_ptr_2 to point to
                    the record to be inserted.  (Input/Output)

                    At the last call to the output_record exit
                    procedure (end of output), there is no record
                    leaving the merge and rec_ptr_2 = null().

4.   rec_len_2      is the length of the record leaving the
                    merge.  The output_record exit procedure may
                    not change the length of this record.  For
                    all actions except insert, the Sort will
                    ignore the value of rec_len_2 returned to it
                    by the output_record exit procedure.  If the
                    action is insert, then the output_record exit
                    procedure must set rec_len_2 to the length of
                    the record to be inserted.  (Input/Output)

5.   action         indicates the action to be taken upon return
                    to the Sort.  (Input/Output)

                    Possible values of action are:

                    0 = accept the output record.  The output
                        record is written to the output file.
                        The Sort uses the returned values of
                        rec_ptr_1 and rec_len_1 to identify the
                        record to be written.  At the next call
                        to the output_record exit procedure, the
                        record leaving the merge becomes the new
                        output record, and a new record leaving
                        the merge has been obtained.

                        Each time the output_record exit
                        procedure is called, the Sort sets action
                        to this value.

                    1 = delete the output record.  No record is
                        written to the output file.  The Sort
                        ignores the returned values of rec_ptr_1
                        and rec_len_1.  At the next call to the
                        output_record exit procedure, the record
                        leaving the merge becomes the new output
                        record, and a new record leaving the
                        merge has been obtained.

                    2 = delete the record leaving the merge.
                        (This action should be used for
                        summarization into the output record.)
                        No record is written to the output file.
                        At the next call to the output_record
                        exit procedure, the output record remains
                        the same, and a new record leaving the
                        merge has been obtained.  The Sort uses
                        the returned values of rec_ptr_1 and
                        rec_len_1 to identify the output record
                        for that next call to the output_record
                        exit procedure.

                    3 = insert a record after the output record.
                        The output record is written to the
                        output file.  The Sort uses the returned
                        values of rec_ptr_1 and rec_len_1 to
                        identify the record to be written.  The
                        Sort calls the output_record exit
                        procedure again, so that the inserted
                        record may be accepted or an additional
                        record may be inserted.  At this next
                        call to the output_record exit procedure,
                        the inserted record becomes the new
                        output record, and the record leaving the
                        merge remains the same.  The Sort uses
                        the returned values of rec_ptr_2 and
                        rec_len_2 to identify the inserted
                        record.

                    At the last call to the output_record exit
                    procedure (end of output), if the
                    output_record exit procedure inserts records
                    then they are appended at the end of output.
                    Any other value for action means do not
                    append any records, and the output_record
                    exit will not be taken again.

6.   equal_key      indicates whether the output record and the
                    record leaving the merge have equal keys.
                    (Input)

                    Possible values are:

                    0 = the two records rank equal.

                    1 = the two records do not rank equal.  At
                        the first and last calls to the
                        output_record exit procedure (beginning
                        of output and end of output), only one
                        record is present and the Sort sets
                        equal_key to this value.

                    If the user supplied key descriptions, then
                    the value of equal_key is determined only by
                    those key fields;  the original input order
                    of the two records is not used to resolve key
                    equality.  If the user supplied a compare
                    exit procedure, then the Sort uses the result
                    of that compare exit procedure to set the
                    value of equal_key.  (In either case, if the
                    two records rank equal then rec_ptr_1 points
                    to the record which is first according to the
                    original input order of the two records.)

7.   equal_key_sw   indicates whether or not equal key checking
                    is to be performed.  (Input/Output)

                    Possible values are:

                    "0" = do not check for equal keys.  At the
                          first call to the output_record exit
                          procedure (beginning of output), the
                          Sort sets equal_key_sw to this value.

                    "1" = check for equal keys before the next
                          call to the output_record exit
                          procedure.

                    Since equal key checking takes time, the user
                    should set equal_key_sw = "1" only when
                    required for actions such as summarization.

8.   seq_check_sw   indicates whether or not sequence checking is
                    to be performed.  (Input/Output)

                    Possible values are:

                    "0" = do not sequence check.

                    "1" = sequence check.  At the first call to
                          the output_record exit procedure
                          (beginning of output), the Sort sets
                          seq_check_sw to this value.

                    Sequence checking means comparing the output
                    record to the record previously written to
                    the output file.  (If the user specified an
                    output_file exit procedure, the output record
                    is compared to the record previously returned
                    to that exit procedure.)  Sequence checking
                    is performed after the output_record exit
                    procedure returns to the Sort, and only if a
                    record is to be written to the output file
                    (that is, only if the action is accept or
                    insert).  If the user supplied key
                    descriptions, then the Sort's key comparison
                    routine is used.  If the user supplied a
                    compare exit procedure, then that exit
                    procedure is called.

                    If the output record is out of sequence with
                    the previous record, then the status code
                    fatal_error is returned to the caller of
                    sort_;  see the entry sort_ above.  (If the
                    user specified an output_file exit procedure,
                    then the status code data_seq_error is
                    returned to that exit procedure;  see the
                    entry sort_$return below.)

                    All records written to the output file,
                    including inserted records, can be sequence
                    checked.

9.   close_exit_sw  indicates whether the exit is to be closed
                    hereafter.  (Input/Output)

                    Possible values are:

                    "0" = keep this exit open.  Each time the
                          output_record exit procedure is called,
                          the Sort sets close_exit_sw to this
                          value.

                    "1" = close this exit.  The Sort will not
                          call the output_record exit procedure
                          again during this execution of the Sort
                          (even if the action is insert).
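
The first summarization technique described earlier (accumulating into the first record of each sequence of equal keys) might be sketched as below. The record layout (rec, with fields key and amount) is hypothetical, and the sketch assumes that equal_key_sw retains the value set at the first call.

```pli
output_record:  proc (rec_ptr_1, rec_len_1, rec_ptr_2, rec_len_2,
                      action, equal_key, equal_key_sw,
                      seq_check_sw, close_exit_sw);

dcl (rec_ptr_1        ptr,
     rec_len_1        fixed bin(21),
     rec_ptr_2        ptr,
     rec_len_2        fixed bin(21),
     action           fixed bin,
     equal_key        fixed bin(1),
     equal_key_sw     bit(1),
     seq_check_sw     bit(1),
     close_exit_sw    bit(1)) parameter;

/* Hypothetical record layout: a key and a value to be totalled. */
dcl 1 rec based aligned,
      2 key       char(8),
      2 amount    fixed bin(35);
dcl null          builtin;

     if rec_ptr_1 = null () then        /* first call: no output record */
          equal_key_sw = "1"b;          /* turn equal key checking on   */

     else if equal_key = 0 then do;     /* keys equal: summarize        */
          rec_ptr_1 -> rec.amount =
               rec_ptr_1 -> rec.amount + rec_ptr_2 -> rec.amount;
          action = 2;                   /* delete record leaving merge  */
     end;

     /* Otherwise action keeps the value 0 set by the Sort (accept). */
end output_record;
```

At the last call rec_ptr_2 is null, but the Sort reports the keys as unequal there, so the accumulation branch is not taken and the final summary record is accepted.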
RECORD POINTERS

Since the Sort aligns each record in a buffer on a double
word boundary, if an exit procedure applies a based declaration
of the record to the pointer(s) then correct alignment is
ensured.

ORIGINAL INPUT ORDER (FIFO)

For the compare and output_record exit procedures, rec_ptr_1
always points to the record whose original input order was prior
to the record pointed to by rec_ptr_2.  If a compare exit
procedure returns with an equal ranking for the two records, then
this original input order is preserved.  Original input order has
been defined earlier under the heading Key Fields.






Entry:  sort_$release

The entry sort_$release is used each time the caller
releases a record to the sorting process.  Calls to
sort_$release are made from a user supplied input_file procedure.
The caller specifies the location and length of the record.  The
Sort accepts the record and stores it in its own work area.

Usage

     dcl sort_$release entry(ptr, fixed bin(21), fixed bin(35));

     call sort_$release (buff_ptr, rec_len, code);

where:

1.   buff_ptr       is a pointer to a byte aligned buffer
                    containing the record.  (Input)

2.   rec_len        is the length of the record in bytes.
                    (Input)

3.   code           is a standard Multics status code returned by
                    the Sort.  Possible values are listed below
                    under the heading Status Codes.  (Output)

Notes

The Sort aligns each record on a double word boundary in a
work area.

Status Codes

The following status codes may be returned by the
sort_$release entry point (all codes are in error_table_):

0                   Normal return (no error).

out_of_sequence     The call to sort_$release is not in the
                    sequence required by the Sort;  e.g.,
                    sort_$release has been called before sort_.

fatal_error         The Sort has encountered a fatal error during
                    the sorting process.  The Sort will have
                    previously generated a specific error message
                    and signalled the sub_error_ condition via
                    the sub_err_ subroutine.

long_record         This input record is longer than the maximum
                    supported.  The record is ignored by the
                    Sort, and the caller may continue to release
                    records to the Sort.

short_record        This input record is shorter than the minimum
                    required to contain the key fields.  The
                    record is ignored by the Sort, and the caller
                    may continue to release records to the Sort.

Entry:  sort_$return

The sort_$return entry is used each time the caller
retrieves a record, in ranked order, from the Sort.  Calls to
sort_$return are made from a user supplied output_file procedure.
Upon return from sort_$return, the caller is given the location
and length of the record.

If sort_$return is called but there are no more records to
be retrieved, then sort_$return returns to the caller with the
status code end_of_info.

Usage

     dcl sort_$return entry(ptr, fixed bin(21), fixed bin(35));

     call sort_$return (buff_ptr, rec_len, code);

where:

1.   buff_ptr       is a pointer to a double word aligned buffer
                    containing the record.  (Output)

2.   rec_len        is the length of the record in bytes.
                    (Output)

3.   code           is a standard Multics status code returned by
                    the Sort.  Possible values are listed below
                    under the heading Status Codes.  (Output)

Notes

The Sort aligns each record on a double word boundary in a
work area.  Thus if the caller applies a based declaration of the
record to the pointer then correct alignment is ensured.



Status Codes

The following status codes may be returned by the
sort_$return entry point (all codes are in error_table_):

0                   Normal return (not end of information, no
                    error).

end_of_info         There are no more records to be retrieved
                    from the Sort.  This is the normal end of
                    data indication.  No record is returned to
                    the caller.

out_of_sequence     The call to sort_$return is not in the
                    sequence required by the Sort;  e.g.,
                    sort_$return has been called before
                    sort_$release.

fatal_error         The Sort has encountered a fatal error during
                    the sorting process.  The Sort will have
                    previously generated a specific error message
                    and signalled the sub_error_ condition via
                    the sub_err_ subroutine.

data_loss           End of data has been reached, but the number
                    of records previously returned is less than
                    the number of records released to the Sort.
                    No record is returned to the caller.

data_gain           The number of records returned (including
                    this record) is now larger than the number of
                    records released to the Sort.  The current
                    record is returned to the caller, and the
                    caller may continue to retrieve records from
                    the Sort.

data_seq_error      A ranking error has occurred in the records
                    returned to the caller (as determined by the
                    key fields of the record).  The current
                    record is returned to the caller, and the
                    caller may continue to request records from
                    the Sort.

Name:  merge_



The merge_ subroutine provides a generalized file merging
capability, which is specialized for execution by user supplied
parameters.  The basic function of merge_ is to read one or more
input files of records which are in order according to the values
of one or more key fields, merge (collate) those records
according to the values of those key fields, and write a single
output file of ordered (or "ranked") records.  The merge_
subroutine has the following general capabilities:

Input and output files may be on any storage medium and in
any file organization;

Very large files, such as multisegment files, can be merged;

Multiple key fields and most PL/I string and numeric data
types may be specified;

Exits to user supplied subroutines are permitted at several
points during the merging process.

The arguments to the merge_ subroutine include one or more
pointers to additional information necessary to specialize merge_
for execution.  This additional information is called the Merge
Description.



INPUT AND OUTPUT

The user specifies the input and output files.  The Merge
reads the input files and writes the output file.  Each input or
output file may be stored on any medium and in any file
organization supported by an I/O module through iox_.  The I/O
module may be one of the Multics system I/O modules (such as
tape_ansi_), or one supplied by a specific installation, or one
written by a user.  An input or output file is specified either
by a pathname or by an attach description.

In all cases, records may be either fixed length or variable
length.



KEY FIELDS 

The user can specify the key fields to be used in ranking 
records. Key fields are described in the Keys statement - or in 
the keys structure - of the Merge Description. Up to 20 key 
fields may be specified. Any PL/I string or numeric data type 
- except complex or pictured - may be specified for a given key 
field. Ranking may be ascending, descending, or mixed. For a 
character string key field, the collating sequence is that of the 
Multics standard character set. The records of each input file 
must be in order according to those key fields. 

Alternatively, the user can supply a compare procedure, 
which is then used to rank records. The records of each input 
file must be in order according to the algorithm of that 
procedure. 

The original input order of records with equal keys is 
preserved (FIFO order). Original input order is defined as 
follows: 

1. If two equal records come from different input files, then 
the record from the file which is specified earlier in the 
list of input files (in the input_specs subroutine argument) 
is first. 

2. If two equal records come from the same input file, then the 
record which is earlier in the file is first. 
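
The two FIFO rules above can be illustrated with a small model. 
The following is only a sketch in Python, not the Merge's actual 
algorithm: the function name merge_fifo and the representation of 
records as tuples are assumptions made for illustration, and only 
ascending ranking is modelled. Each heap entry carries the index 
of the input file, so that equal keys are resolved in favor of the 
file specified earlier; within one file, records are simply taken 
in the order in which they are read. 

```python
import heapq

_END = object()  # sentinel marking an exhausted input file


def merge_fifo(input_files, key):
    """Merge already-ordered record lists, keeping FIFO order on equal keys."""
    iterators = [iter(records) for records in input_files]
    heap = []
    for file_no, it in enumerate(iterators):
        first = next(it, _END)
        if first is not _END:
            # (key, file_no) is unique in the heap, so records are never compared
            heap.append((key(first), file_no, first))
    heapq.heapify(heap)

    merged = []
    while heap:
        _, file_no, record = heapq.heappop(heap)
        merged.append(record)
        following = next(iterators[file_no], _END)
        if following is not _END:
            heapq.heappush(heap, (key(following), file_no, following))
    return merged


file_a = [(1, "a1"), (2, "a2"), (2, "a3")]
file_b = [(1, "b1"), (2, "b2"), (3, "b3")]
result = merge_fifo([file_a, file_b], key=lambda record: record[0])
```

In this model, records with key 2 leave the merge in the order a2, 
a3, b2: first by input file, then by position within the file. 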



EXITS 

The Merge provides exits to user supplied procedures at 
specific points during the merging process. Exit procedures are 
named in the Exits statement - or in the exits structure - of the 
Merge Description. The following exit points are provided: 

output_record   To perform special processing for each output 
                record, such as deleting, inserting, or 
                altering records to be output from the Merge; 
                or summarizing data by accumulating it into a 
                summary record. 

compare         To compare two records; that is, to rank them 
                for the merging process. 



Details of exit procedures are given below under the heading 
Writing Exit Procedures. 

dcl merge_ entry ((*) char(*), char(*), (*) ptr, 
          char(*), fixed bin(35)); 

call merge_ (input_specs, output_spec, merge_desc, 
          user_out_sw, code); 

where: 

1. input_specs  is an array containing the specifications of 
                the input files. Up to 10 input files may be 
                specified. The array extent specifies the 
                number of input files. (Input) 

                Input file i is specified in the array 
                element input_specs(i), in one of the 
                following forms: 

                -input_file pathname 
                -if pathname 
                     If an input file is in the Multics 
                     storage system and its file organization 
                     is either sequential or indexed, then it 
                     may be specified by its pathname. The 
                     file may be either a single segment or a 
                     multisegment file. The star convention 
                     cannot be used. 

                     An input file specified by a pathname 
                     will be attached using the attach 
                     description "vfile_ pathname". 

                -input_description attach_desc 
                -ids attach_desc 
                     If an input file is not in the Multics 
                     storage system or its file organization 
                     is neither sequential nor indexed, then 
                     it must be specified by an attach 
                     description. The target I/O module 
                     specified via the attach description 
                     must support the sequential_input 
                     opening mode and the iox_ entry point 
                     read_record. 

                Pathnames and attach descriptions can be 
                intermixed in the input_specs array. 

2. output_spec  is the specification of the output file. 
                Only one output file may be specified. 
                (Input) 

                The output file may be specified in one of 
                the following forms: 

                -output_file pathname 
                -of pathname 
                     If the output file is in the Multics 
                     storage system and its file organization 
                     is sequential, then it may be specified 
                     by its pathname. The file may be either 
                     a single segment or a multisegment file. 

                     The equals convention can be used. If 
                     it is, it is applied to the pathname of 
                     the first input file and the first input 
                     file must be specified by a pathname, 
                     not by an attach description. 

                     An output file specified by a pathname 
                     will be attached using the attach 
                     description "vfile_ pathname". Thus if 
                     the file does not exist, it will be 
                     created. If it does exist, it will be 
                     overwritten. 

                -output_description attach_desc 
                -ods attach_desc 
                     If the output file is not in the Multics 
                     storage system or its file organization 
                     is not sequential, then it must be 
                     specified by an attach description. The 
                     target I/O module specified via the 
                     attach description must support the 
                     sequential_output opening mode and the 
                     iox_ entry point write_record. 



3. merge_desc   is an array of pointers to the Merge 
                Description. See the heading Merge 
                Description below. (Input) 



4. user_out_sw  specifies the destination of both the summary 
                report and diagnostic messages for errors 
                detected in the arguments to merge_ or in the 
                Merge Description. (Input) 

                This argument may have the following values: 

                ""         = write the summary report and 
                             diagnostic messages via the I/O 
                             switch user_output. 

                "-bf"      = do not write the summary report 
                             and diagnostic messages. If any 
                             errors are diagnosed, merge_ 
                             will return with the status code 
                             bad_arg but information about 
                             the number and nature of the 
                             errors is not available. 

                switchname = write the summary report and 
                             diagnostic messages via the I/O 
                             switch named switchname. The 
                             switch must be attached and open 
                             for stream output. 

5. code         is a standard Multics status code returned by 
                merge_. Possible values are listed below 
                under the heading Status Codes. (Output) 



NOTES 

Any pathname may be relative (to the user's current working 
directory) or absolute. 
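
The way a file specification is turned into an attach description 
can be sketched as follows. This is an illustrative model in 
Python only, not part of the Merge: the function names and the 
I/O module name my_io_module_ are hypothetical, and the star and 
equals conventions and the resolution of relative pathnames are 
not modelled. It shows that a pathname form always yields "vfile_ 
pathname", that an attach description is used unchanged, and that 
more than 10 input files are rejected. 

```python
MAX_INPUT_FILES = 10

# Control argument names accepted for each direction (long and short forms).
PATHNAME_FORMS = {"input": ("-input_file", "-if"),
                  "output": ("-output_file", "-of")}
DESCRIPTION_FORMS = {"input": ("-input_description", "-ids"),
                     "output": ("-output_description", "-ods")}


def attach_description(spec, direction):
    """Return the attach description implied by one file specification."""
    keyword, _, rest = spec.strip().partition(" ")
    rest = rest.strip()
    if not rest:
        raise ValueError("missing pathname or attach description: " + spec)
    if keyword in PATHNAME_FORMS[direction]:
        return "vfile_ " + rest          # pathnames are attached through vfile_
    if keyword in DESCRIPTION_FORMS[direction]:
        return rest                      # attach descriptions are used as given
    raise ValueError("unrecognized file specification: " + spec)


def plan_attachments(input_specs, output_spec):
    """Model the checks on input_specs and output_spec before merging starts."""
    if len(input_specs) > MAX_INPUT_FILES:
        raise ValueError("at most 10 input files may be specified")
    inputs = [attach_description(spec, "input") for spec in input_specs]
    return inputs, attach_description(output_spec, "output")


inputs, output = plan_attachments(
    ["-if a.data", "-ids my_io_module_ some_args"],   # hypothetical I/O module
    "-of merged.data")
```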



STATUS CODES 

The following status codes may be returned by merge_ (all 
codes are in error_table_): 

0                Normal return (no errors). 

bad_arg          One or more arguments specified to merge_, 
                 including those in the Merge Description, was 
                 invalid or inconsistent. The Merge will have 
                 previously written diagnostic messages as 
                 directed by the user_out_sw argument. The 
                 merging process itself has not been started. 

fatal_error      The Merge has encountered a fatal error 
                 during the merging process. The Merge will 
                 have previously generated a specific error 
                 message and signalled the sub_error_ 
                 condition via the sub_err_ subroutine. 

out_of_sequence  The call to merge_ is not in the sequence 
                 required by the Merge; that is, merge_ has 
                 been called after initiation of the Merge but 
                 before termination of that invocation. 

Merge Description 

The Merge Description contains additional information to 
specialize the Merge for a particular execution. The Merge 
Description is specified via the merge_desc argument to merge_. 
The information specified may be: 

Keys  - Description of one or more key fields used for 
        ranking records. 

Exits - Specification of which exit points are to be used 
        and the names of the corresponding user supplied 
        exit procedures. 

A Merge Description is required. As a minimum, the user 
must specify how records are to be ranked, either by describing 
key fields in the Keys statement or by naming a compare exit 
procedure in the Exits statement. Other information in the Merge 
Description is optional. 

The Merge Description may be supplied to merge_ in either of 
two forms, called source form and internal form. 

The source form of the Merge Description is written exactly 
as specified for the merge command (see the Multics Programmers' 
Manual, Commands and Active Functions, Section III), and is 
stored as an ASCII segment; that is, as an unstructured file in 
the Multics storage system. If source form is used, then the 
merge_desc argument to merge_ must have an array extent of 1 and 
the one pointer must be a pointer to the segment. (The segment 
must contain only the Merge Description.) The source form is 
useful when the user writes the Merge Description and supplies it 
to the procedure which calls merge_. 

The internal form of the Merge Description is a set of one 
or two structures. The merge_desc argument must have an array 
extent of 2, and the two pointers are pointers to the two 
structures. Any of the structures can be omitted; in that case 
the corresponding pointer must be null. The pointers must be 
specified in the array in the following order: 

addr (keys) 
addr (exits) 

where the two structures (keys and exits) are defined below. The 
internal form is useful when the procedure calling merge_ 
constructs the Merge Description. 

KEYS STRUCTURE 

The keys structure is used when the caller describes key 
fields. The Merge's standard compare routine will then be used 
to rank records. If the caller describes keys, then the compare 
exit must not be specified. 

If the caller does not describe keys, then the corresponding 
pointer in the array merge_desc must be null and the compare exit 
must be specified in the exits structure. The user supplied 
compare routine will then be used to rank records. 
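
The consistency rules for the merge_desc argument can be 
summarized in a short model. The following Python sketch is an 
illustration only: the names Exits and classify_merge_desc are 
hypothetical, None stands both for a null pointer and for an entry 
variable set to merge_$noexit, and the contents of a source-form 
segment are not examined. It checks the array extent (1 for source 
form, 2 for internal form) and the rule that records are ranked 
either by key descriptions or by a compare exit, but not by both. 

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Exits:
    """Stand-in for the exits structure; None means the exit is not used."""
    compare: Optional[Callable] = None
    output_record: Optional[Callable] = None


def classify_merge_desc(merge_desc):
    """Return 'source' or 'internal', or raise ValueError (models bad_arg)."""
    if len(merge_desc) == 1:
        if merge_desc[0] is None:
            raise ValueError("source form requires a pointer to the segment")
        return "source"
    if len(merge_desc) != 2:
        raise ValueError("merge_desc must have an array extent of 1 or 2")

    keys, exits = merge_desc             # required order: keys, then exits
    has_keys = keys is not None
    has_compare = exits is not None and exits.compare is not None
    if has_keys and has_compare:
        raise ValueError("keys and a compare exit must not both be specified")
    if not has_keys and not has_compare:
        raise ValueError("no ranking specified: supply keys or a compare exit")
    return "internal"


source_segment = object()                # placeholder for an ASCII segment
form_1 = classify_merge_desc([source_segment])
form_2 = classify_merge_desc([["key description"], None])
form_3 = classify_merge_desc([None, Exits(compare=lambda a, b: 0)])
```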

The keys structure is: 

dcl 1 keys, 
      2 version fixed bin init(1), 
      2 number fixed bin, 
      2 key_desc(user_keys_number refer(keys.number)), 
        3 datatype char(8), 
        3 size fixed bin(24), 
        3 word_offset fixed bin(18), 
        3 bit_offset fixed bin(6), 
        3 desc char(3); 

where: 

1. version      is the version number of the structure (must 
                be 1). 

2. number       is the number of key fields, established by 
                the value of user_keys_number. 

3. key_desc     is an array of key descriptions. Each key 
                description is one element of the array. The 
                key descriptions must be specified in order, 
                the major key first and the most minor key 
                last. 

4. datatype     is the data type of the key field. See the 
                table below for the encoding of datatype. 
                The value must be left justified within 
                datatype. 

5. size         is the size of the key field, in units which 
                depend on the data type. 

                For string data types, size is the exact 
                length (characters or bits) of the field. 

                For arithmetic data types, size is the 
                precision (binary or decimal digits) of the 
                field. The space occupied is determined by 
                precision in combination with the data type. 
                The space occupied is not adjusted for an 
                aligned field. For example, for an aligned 
                fixed binary field of one word, size must be 
                specified as 35; for an aligned floating 
                binary field of two words, size must be 
                specified as 63. See the table below for the 
                semantics of size. 

6. word_offset  is the word portion of the offset of the 
                beginning of the key field, relative to the 
                beginning of the record. Consider the record 
                as being aligned on a word boundary, as will 
                be the case for a Multics PL/I structure. 
                Words are numbered from 0 for the first word 
                of the record. 

7. bit_offset   is the bit portion of the offset of the key 
                field; that is, the bit offset within the 
                word in which the key field begins. Bits are 
                numbered from 0 to 35. (If the field is 
                aligned on a word boundary, then bit_offset 
                is 0.) 

8. desc         indicates whether ranking for this key field 
                is to be ascending or descending. Possible 
                values are: 

                ""    = use ascending ranking. 

                "dsc" = use descending ranking. 

DATATYPE ENCODING AND SEMANTICS OF SIZE 

                   Encoding |  Semantics of size (where size = n) 
                   of       | 
                   datatype |  Unit             Range     Space Occupied 
------------------------------------------------------------------------ 
Character string   char     |  9 bit character  1 - 4095  n characters 
(Multics ASCII) 

Bit string         bit      |  1 bit            1 - 4095  n bits 

Fixed binary       bin      |  1 bit            1 - 71    n + 1 bits 

Floating binary    flbin    |  1 bit            1 - 63    n + 9 bits 

Fixed decimal      dec      |  9 bit digit      1 - 59    n + 1 digits 
(leading sign) 

Floating decimal   fldec    |  9 bit digit      1 - 59    n + 2 digits 
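
As a cross-check on the table, the space occupied by a key field 
and its starting bit within the record can be computed from the 
datatype, size, word_offset, and bit_offset values. The sketch 
below is a Python model written for illustration; the function 
name key_extent_bits and the dictionary SIZE_RULES are not part of 
the Merge, and the table values are taken as reconstructed above. 
It reproduces the two examples given for size: a fixed binary 
field with size 35 occupies one 36 bit word, and a floating binary 
field with size 63 occupies two words. 

```python
BITS_PER_WORD = 36

# datatype: (bits per unit, extra units beyond n, smallest n, largest n)
SIZE_RULES = {
    "char":  (9, 0, 1, 4095),
    "bit":   (1, 0, 1, 4095),
    "bin":   (1, 1, 1, 71),
    "flbin": (1, 9, 1, 63),
    "dec":   (9, 1, 1, 59),
    "fldec": (9, 2, 1, 59),
}


def key_extent_bits(datatype, size, word_offset, bit_offset):
    """Return (first bit, length in bits) of a key field within its record."""
    unit_bits, extra_units, smallest, largest = SIZE_RULES[datatype.rstrip()]
    if not smallest <= size <= largest:
        raise ValueError("size out of range for " + datatype.rstrip())
    if not 0 <= bit_offset < BITS_PER_WORD:
        raise ValueError("bit_offset must be between 0 and 35")
    first_bit = word_offset * BITS_PER_WORD + bit_offset
    length = (size + extra_units) * unit_bits
    return first_bit, length


one_word_fixed = key_extent_bits("bin", 35, 0, 0)
two_word_float = key_extent_bits("flbin", 63, 2, 0)
```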
EXITS STRUCTURE 

The exits structure is: 

dcl 1 exits, 
      2 version fixed bin init(1), 
      2 compare entry, 
      2 reserved entry init(merge_$noexit), 
      2 output_record entry; 

where: 

1. version        is the version number of the structure (must 
                  be 1). 

2. compare        specifies the entry point of a user supplied 
                  compare exit procedure. If the caller 
                  describes key fields (supplies a keys 
                  structure), then this exit must not be 
                  specified. 

3. reserved       is reserved for future use. 

4. output_record  specifies the entry point of a user supplied 
                  output_record exit procedure. 



ENTRY VARIABLES 

In the exits structure, each exit point is specified via an 
entry variable. The entry variable must be set (either 
initialized or assigned) by a user procedure, normally the 
procedure which calls merge_. The entry variable can identify 
either an internal entry point (that is, an internal procedure) 
or an external entry point of the procedure which sets the entry 
variable; or it can identify an external entry point of another 
user procedure. 

If none of the exits declared in the exits structure is to 
be used, then that structure can be omitted and the 
corresponding pointer in the array merge_desc must be null. If 
the structure is included but an exit specified in it is not to 
be used, then the corresponding entry variable must be set to 
merge_$noexit, which is declared: 

dcl merge_$noexit entry external; 

An exit point may not be altered after the call to merge_. 
Any change to the entry variable thereafter will have no effect. 
However, certain entry points can be disabled, as specified in 
the descriptions of the individual exit procedures below. 

Writing Exit Procedures 

A user supplied exit procedure is called by the Merge to 
perform a specified function. The user procedure must perform 
that function, and then must return to the Merge. The user exit 
procedure may perform additional functions desired by the user. 

Certain exit procedures replace the corresponding standard 
routine of the Merge. Other exit procedures supplement the 
normal functions of the Merge. This is specified for each 
individual exit procedure below. 

The following exit points are provided: 

output_record 
compare 

All exit points may be active during the same invocation of 
the Merge. 

The entry point names of all user supplied exit procedures 
are defined by the user. Specific names are shown below only for 
convenience in discussion. 

COMPARE EXIT PROCEDURE 

A compare exit procedure replaces the standard record 
comparison procedure of the Merge. The Merge calls the compare 
exit procedure each time the merging process is ready to rank two 
records; that is, to determine which of the two is first in the 
merged order. 

A compare exit procedure must perform the following 
function: The compare exit procedure receives as arguments a 
pointer to each of the two records. The compare exit procedure 
must determine which of the two records is first - or that they 
are equal in rank - and must return a corresponding return value 
to the Merge. The compare exit procedure is invoked as a 
function. 



Usage 

compare: proc (rec_ptr_1, rec_ptr_2) returns (fixed bin(1)); 

dcl (rec_ptr_1 ptr, 
     rec_ptr_2 ptr) parameter; 

dcl result fixed bin(1); 

     . . . 

return (result); 
end compare; 

where: 

1. rec_ptr_1    is a pointer to a double word aligned buffer 
                containing the first record of the pair to be 
                compared. This record is always the first of 
                the two according to the original input 
                order. (Input) 

2. rec_ptr_2    is a pointer to a double word aligned buffer 
                containing the second record of the pair to be 
                compared. (Input) 

3. result       is the result of the comparison. (Output) 
                Possible values are: 

                 0 = the two records rank equal. 

                -1 = the record pointed to by rec_ptr_1 ranks 
                     first. 

                +1 = the record pointed to by rec_ptr_2 ranks 
                     first. 

If a compare exit procedure requires the length of either 
record, it is available in the word preceding that record in the 
form: 

dcl rec_len fixed bin(21) aligned; 

A compare exit procedure cannot alter either the content or 
the length of either record. 
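
The return value convention of a compare exit procedure can be 
illustrated outside PL/I. The following Python sketch is a model 
only: the record layout (a name, a comma, then other data) and the 
case-insensitive ranking are assumptions chosen to show a ranking 
that key descriptions alone do not express. The function returns 
-1, 0, or +1 exactly as described for result above and does not 
modify either record. 

```python
from functools import cmp_to_key


def compare(rec_1, rec_2):
    """Rank two records by the name before the first comma, ignoring case."""
    name_1 = rec_1.split(",")[0].lower()
    name_2 = rec_2.split(",")[0].lower()
    if name_1 < name_2:
        return -1        # the first record ranks first
    if name_1 > name_2:
        return +1        # the second record ranks first
    return 0             # the two records rank equal


# sorted() is stable, so records that rank equal keep their original order,
# which corresponds to the FIFO rule for equal records.
ranked = sorted(["baker,1", "Adams,2", "adams,3"], key=cmp_to_key(compare))
```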
OUTPUT_RECORD EXIT PROCEDURE 

An output_record exit procedure supplements the standard 
output writing function of the Merge. The Merge calls the 
output_record exit procedure: 

1. Each time it has determined the next record in ranked order 
from the merging process; 

2. Once more after the last record has been obtained from the 
merging process (end of output); 

3. Additionally, each time the output_record exit procedure 
returns with an action of insert. 

The Merge gives the output_record exit procedure access to 
two records: 

1. The output record, about to be written to the output file. 

2. The next record, the record leaving the merging process. 

An output_record exit procedure need not perform any 
processing. If it does not, then the output record is accepted 
for the output file. 



An output_record exit procedure may perform the following 
functions, which are accomplished via the values of arguments 
returned when the output_record exit procedure returns to the 
Merge: 

Accept the output record. This is accomplished by setting 
action = 0. 

Delete the output record. This is accomplished by setting 
action = 1. 

Delete the record leaving the merge. This is accomplished 
by setting action = 2. 

Insert one or more records after the output record. (At the 
first call to the output_record exit procedure, records may 
be inserted at the beginning of output. At the last call to 
the output_record exit procedure, records may be inserted at 
the end of output.) This is accomplished by setting 
rec_ptr_2 to point to the record to be inserted, setting 
rec_len_2 appropriately, and setting action = 3. 

Alter the output record, before it is written to the output 
file. This is accomplished by altering the record pointed 
to by rec_ptr_1 or setting rec_ptr_1 to point to another 
record, setting rec_len_1 appropriately, and setting action 
= 0 to accept (or action = 3 to insert). 

Summarize data into the first record of a sequence of 
records with equal keys, and delete the succeeding records 
of the sequence. This may be accomplished as follows: At 
the first call to the output_record exit procedure, set 
equal key checking on (equal_key_sw = "1"). At subsequent 
calls to the output_record exit procedure, if the output 
record and the record leaving the merge have equal keys 
(equal_key = 0), then accumulate data into the output record 
and delete the record leaving the merge (action = 2). If the 
two records have unequal keys (equal_key ^= 0), then accept 
the output record (action = 0). 

Summarize data into the last record of a sequence with equal 
keys, and delete the preceding records of the sequence. 
This may be accomplished as follows: At the first call to 
the output_record exit procedure, set equal key checking on. 
At subsequent calls, if the two records have equal keys then 
accumulate data into a work area and delete the output 
record (action = 1). If the two records have unequal keys, 
then alter the output record using the accumulated data and 
accept that record (action = 0). 

Sequence check the output file. This is accomplished by 
setting seq_check_sw = "1". If the output record will not 
collate properly with the output file, or does not have its 
keys in the position specified to the Merge, then set 
seq_check_sw = "0". 

Close the exit point so that the output_record exit 
procedure will not be called again during this execution of 
the Merge. This is accomplished by setting close_exit_sw = 
"1". 

The output_record exit procedure must return to the Merge 
each time it is called. 
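
The interplay of the four actions is easier to follow in a small 
simulation. The following Python sketch models the calling 
sequence described above together with an exit procedure that 
summarizes into the first record of each sequence of equal keys. 
It is an illustration under stated assumptions, not the Merge's 
implementation: all names are hypothetical, None stands for a null 
record pointer, records are (key, amount) tuples, the exit returns 
its output arguments as a tuple, and sequence checking and 
close_exit_sw are left out. The description does not define action 
= 2 at the last call, so the model rejects it. 

```python
ACCEPT, DELETE, DELETE_NEXT, INSERT = 0, 1, 2, 3


def run_output_exit(merged, exit_proc, keys_equal):
    """Model of the calling loop; returns the records written to the output."""
    written = []
    source = iter(merged)
    cur, nxt = None, next(source, None)   # first call: no output record yet
    equal_key_sw = False                  # equal key checking starts off
    while True:
        checking = equal_key_sw and cur is not None and nxt is not None
        equal_key = 0 if checking and keys_equal(cur, nxt) else 1
        action, cur, inserted, equal_key_sw = exit_proc(
            cur, nxt, equal_key, equal_key_sw)
        if action not in (ACCEPT, DELETE, DELETE_NEXT, INSERT):
            raise ValueError("unknown action")
        at_end = nxt is None              # last call: nothing leaves the merge
        if at_end and action == DELETE_NEXT:
            raise ValueError("no record is leaving the merge")
        if action in (ACCEPT, INSERT) and cur is not None:
            written.append(cur)           # the (possibly altered) output record
        if action == INSERT:
            cur = inserted                # inserted record is the next output record
            continue                      # record leaving the merge is unchanged
        if at_end:
            return written
        if action != DELETE_NEXT:
            cur = nxt                     # record leaving the merge moves up
        nxt = next(source, None)


def summarize_first(cur, nxt, equal_key, equal_key_sw):
    """Accumulate amounts into the first record of each equal-key sequence."""
    if cur is not None and nxt is not None and equal_key == 0:
        return DELETE_NEXT, (cur[0], cur[1] + nxt[1]), None, True
    return ACCEPT, cur, None, True        # also turns equal key checking on


merged = [("a", 1), ("a", 2), ("b", 5), ("b", 1), ("c", 7)]
summary = run_output_exit(merged, summarize_first, lambda x, y: x[0] == y[0])
```

With this input the model writes one record per key, carrying the 
accumulated amounts 3, 6, and 7. 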



Usage 

output_record: proc (rec_ptr_1, rec_len_1, rec_ptr_2, rec_len_2, 
                     action, equal_key, equal_key_sw, 
                     seq_check_sw, close_exit_sw); 

dcl (rec_ptr_1      ptr, 
     rec_len_1      fixed bin(21), 
     rec_ptr_2      ptr, 
     rec_len_2      fixed bin(21), 
     action         fixed bin, 
     equal_key      fixed bin(1), 
     equal_key_sw   bit(1), 
     seq_check_sw   bit(1), 
     close_exit_sw  bit(1)) parameter; 



where: 

1. rec_ptr_1      points to a double word aligned buffer 
                  containing the output record. The 
                  output_record exit procedure may alter the 
                  contents of this record or may change the 
                  pointer to point to another record. The 
                  Merge uses the value of rec_ptr_1 returned to 
                  it by the output_record exit procedure as 
                  specified below in the description of the 
                  action argument. (Input/Output) 

                  At the first call to the output_record exit 
                  procedure (beginning of output), there is no 
                  output record and rec_ptr_1 = null(). 

2. rec_len_1      is the length of the output record in bytes. 
                  The output_record exit procedure may change 
                  the length of this record. The Merge uses 
                  the value of rec_len_1 returned to it by the 
                  output_record exit procedure as specified 
                  below in the description of the action 
                  argument. (Input/Output) 

3. rec_ptr_2      points to a double word aligned buffer 
                  containing the record leaving the merge. The 
                  output_record exit procedure may not alter 
                  the contents of this record. For all actions 
                  except insert, the Merge will ignore the 
                  value of rec_ptr_2 returned to it by the 
                  output_record exit procedure. If the action 
                  is insert, then the output_record exit 
                  procedure must change rec_ptr_2 to point to 
                  the record to be inserted. (Input/Output) 

                  At the last call to the output_record exit 
                  procedure (end of output), there is no record 
                  leaving the merge and rec_ptr_2 = null(). 

4. rec_len_2      is the length of the record leaving the 
                  merge. The output_record exit procedure may 
                  not change the length of this record. For 
                  all actions except insert, the Merge will 
                  ignore the value of rec_len_2 returned to it 
                  by the output_record exit procedure. If the 
                  action is insert, then the output_record exit 
                  procedure must set rec_len_2 to the length of 
                  the record to be inserted. (Input/Output) 

5. action         indicates the action to be taken upon return 
                  to the Merge. (Input/Output) 

                  Possible values of action are: 

                  0 = accept the output record. The output 
                      record is written to the output file. 
                      The Merge uses the returned values of 
                      rec_ptr_1 and rec_len_1 to identify the 
                      record to be written. At the next call 
                      to the output_record exit procedure, the 
                      record leaving the merge becomes the new 
                      output record, and a new record leaving 
                      the merge has been obtained. 

                      Each time the output_record exit 
                      procedure is called, the Merge sets 
                      action to this value. 

                  1 = delete the output record. No record is 
                      written to the output file. The Merge 
                      ignores the returned values of rec_ptr_1 
                      and rec_len_1. At the next call to the 
                      output_record exit procedure, the record 
                      leaving the merge becomes the new output 
                      record, and a new record leaving the 
                      merge has been obtained. 

                  2 = delete the record leaving the merge. 
                      (This action should be used for 
                      summarization into the output record.) 
                      No record is written to the output file. 
                      At the next call to the output_record 
                      exit procedure, the output record remains 
                      the same, and a new record leaving the 
                      merge has been obtained. The Merge uses 
                      the returned values of rec_ptr_1 and 
                      rec_len_1 to identify the output record 
                      for that next call to the output_record 
                      exit procedure. 

                  3 = insert a record after the output record. 
                      The output record is written to the 
                      output file. The Merge uses the returned 
                      values of rec_ptr_1 and rec_len_1 to 
                      identify the record to be written. The 
                      Merge calls the output_record exit 
                      procedure again, so that the inserted 
                      record may be accepted or an additional 
                      record may be inserted. At this next 
                      call to the output_record exit procedure, 
                      the inserted record becomes the new 
                      output record, and the record leaving the 
                      merge remains the same. The Merge uses 
                      the returned values of rec_ptr_2 and 
                      rec_len_2 to identify the inserted 
                      record. 

                  At the last call to the output_record exit 
                  procedure (end of output), if the 
                  output_record exit procedure inserts records 
                  then they are appended at the end of output. 
                  Any other value for action means do not 
                  append any records, and the output_record 
                  exit will not be taken again. 

6. equal_key      indicates whether the output record and the 
                  record leaving the merge have equal keys. 
                  (Input) 

                  Possible values are: 

                   0 = the two records rank equal. 

                  +1 = the two records do not rank equal. At 
                       the first and last calls to the 
                       output_record exit procedure (beginning 
                       of output and end of output), only one 
                       record is present and the Merge sets 
                       equal_key to this value. 

                  If the user supplied key descriptions, then 
                  the value of equal_key is determined only by 
                  those key fields; the original input order 
                  of the two records is not used to resolve key 
                  equality. If the user supplied a compare 
                  exit procedure, then the Merge uses the 
                  result of that compare exit procedure to set 
                  the value of equal_key. (In either case, if 
                  the two records rank equal then rec_ptr_1 
                  points to the record which is first according 
                  to the original input order of the two 
                  records.) 



7. equal_key_sw   indicates whether or not equal key checking 
                  is to be performed. (Input/Output) 

                  Possible values are: 

                  "0" = do not check for equal keys. At the 
                        first call to the output_record exit 
                        procedure (beginning of output), the 
                        Merge sets equal_key_sw to this value. 

                  "1" = check for equal keys before the next 
                        call to the output_record exit 
                        procedure. 

                  Since equal key checking takes time, the user 
                  should set equal_key_sw = "1" only when 
                  required for actions such as summarization. 

8. seq_check_sw   indicates whether or not sequence checking is 
                  to be performed. (Input/Output) 

                  Possible values are: 

                  "0" = do not sequence check. 

                  "1" = sequence check. At the first call to 
                        the output_record exit procedure 
                        (beginning of output), the Merge sets 
                        seq_check_sw to this value. 

                  Sequence checking means comparing the output 
                  record to the record previously written to 
                  the output file. Sequence checking is 
                  performed after the output_record exit 
                  procedure returns to the Merge, and only if a 
                  record is to be written to the output file 
                  (that is, only if the action is accept or 
                  insert). If the user supplied key 
                  descriptions, then the Merge's key comparison 
                  routine is used. If the user supplied a 
                  compare exit procedure, then that exit 
                  procedure is called. 

                  If the output record is out of sequence with 
                  the previous record, then the status code 
                  fatal_error is returned to the caller of 
                  merge_; see the entry merge_ above. 

                  All records written to the output file, 
                  including inserted records, can be sequence 
                  checked. 

9. close_exit_sw  indicates whether the exit is to be closed 
                  hereafter. (Input/Output) 

                  Possible values are: 

                  "0" = keep this exit open. Each time the 
                        output_record exit procedure is called, 
                        the Merge sets close_exit_sw to this 
                        value. 

                  "1" = close this exit. The Merge will not 
                        call the output_record exit procedure 
                        again during this execution of the 
                        Merge (even if the action is insert). 
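
Sequence checking, as described for seq_check_sw above, amounts to 
comparing each record about to be written with the record written 
just before it. The following Python sketch is a model for 
illustration only: the names FatalError and write_checked are 
hypothetical, the exception stands in for the fatal_error status 
code, and the comparison function follows the compare exit 
convention (-1, 0, +1). 

```python
class FatalError(Exception):
    """Stands in for the fatal_error status code in this model."""


def write_checked(records, compare, seq_check=True):
    """Write records in turn, checking each against the one written before."""
    written = []
    for record in records:
        # +1 means the new record ranks before the previous one
        if seq_check and written and compare(written[-1], record) > 0:
            raise FatalError("record out of sequence: %r" % (record,))
        written.append(record)
    return written


def by_number(rec_1, rec_2):
    """Compare two integer records using the -1 / 0 / +1 convention."""
    return (rec_1 > rec_2) - (rec_1 < rec_2)


in_order = write_checked([1, 2, 2, 5], by_number)
unchecked = write_checked([3, 1], by_number, seq_check=False)
```

Equal records pass the check; only a record that ranks before its 
predecessor is reported. 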
RECORD POINTERS 

Since the Merge aligns each record in a buffer on a double 
word boundary, if an exit procedure applies a based declaration 
of the record to the pointer(s) then correct alignment is 
ensured. 

ORIGINAL INPUT ORDER (FIFO) 

For the compare and output_record exit procedures, rec_ptr_1 
always points to the record whose original input order was prior 
to the record pointed to by rec_ptr_2. If a compare exit 
procedure returns with an equal ranking for the two records, then 
this original input order is preserved. Original input order has 
been defined earlier under the heading Key Fields. 
Name: sort 

The sort command is described in the Multics Programmers' 
Manual, Commands and Active Functions, Section III. This 
description includes only additional optional control arguments 
which are not described in MPM Commands. 

sort input_specs output_specs control_args 

where: 

3. control_args  can be chosen from the following (in addition 
                 to those control arguments specified in MPM 
                 Commands): 

   -time         prints timing information for the Sort: 
                    System load (hmu) 
                    Merge order 
                    String size 
                 and for each phase of the Sort: 
                    Elapsed time 
                    Real cpu time 
                    Virtual cpu time 
                    Page faults 
                    Paging device faults 
                    Comparisons executed 

                 (Times are given in seconds.) 

   -merge_order m 
                 specifies that the merge order is to be 
                 m. The argument m must be a decimal 
                 integer. This argument is meaningful 
                 only if all input files are in the 
                 Storage System, so that the total input 
                 file size can be obtained by the Sort. 

   -string_size s 
                 specifies that the string size (as 
                 produced during the presort) is to be s 
                 bytes. The argument s must be a decimal 
                 integer, and must be less than the 
                 system maximum segment size. The actual 
                 size of any string may differ somewhat 
                 from s, since the length of the last 
                 record inserted into the string may not 
                 exactly match the space available. 

                 Merge order and string size cannot both be 
                 specified. 

   -debug        specifies that temporary files will be left 
                 initiated (but truncated to zero length) 
                 after completion of the Sort. This argument 
                 is intended for use with performance 
                 measurement and analysis tools which print 
                 reference names, such as sample_refs. 

                 If this argument is omitted, temporary files 
                 will be deleted after completion of the Sort. 

                 If -debug is specified, deletion of temporary 
                 files must be done explicitly by the user. 
                 Some temporary files are in the process 
                 directory; the work files are in the 
                 directory specified by the -temp_dir 
                 argument. The names of all temporary files 
                 are generated uniquely for each invocation of 
                 the Sort, and always contain the string 
                 "sort_". 

Name: merge 

The merge command is described in the Multics Programmers' 
Manual, Commands and Active Functions, Section III. This 
description includes only additional optional control arguments 
which are not described in MPM Commands. 

merge input_specs output_specs control_args 

where: 

3. control_args  can be chosen from the following (in addition 
                 to those control arguments specified in MPM 
                 Commands): 

   -time         prints timing information for the Merge: 
                    System load (hmu) 
                 and for each phase of the Merge: 
                    Elapsed time 
                    Real cpu time 
                    Virtual cpu time 
                    Page faults 
                    Paging device faults 
                    Comparisons executed 

                 (Times are given in seconds.) 

   -debug        specifies that temporary files will be 
                 left initiated (but truncated to zero 
                 length) after completion of the Merge. 
                 This argument is intended for use with 
                 performance measurement and analysis 
                 tools which print reference names, such 
                 as sample_refs. 

                 If this argument is omitted, temporary 
                 files will be deleted after completion 
                 of the Merge. 

                 If -debug is specified, deletion of 
                 temporary files must be done explicitly 
                 by the user. All temporary files are in 
                 the process directory. The names of all 
                 temporary files are generated uniquely 
                 for each invocation of the Merge, and 
                 always contain the string "sort_". 

Name: sort_ 

The sort_ subroutine is descrioed in the Mul tics 
Programmers* Manual* Subroutines* Section II. This description 
includes only additional entry points which are not described in 
MPM Subroutines. 



£ntrv? sor t_$ini ti ate 

The sor t_$ini tiate entry point is used when the Sort Is 
"driven" by its caller. The Sort is said to be driven if the 
caller supplies a procedure which calls (or directly performs) 
the input file processing and outout file processing procedures. 
Such a driver must have the following general form! 



call sort_$lnitiate(arguments> ? 

call input_f i I e_proc (code ) ? 

call sor t_$commence (code) * 

call output_f i J e_proc<cod€) ? 

call sort_Bterminate(code) ; 



where:

1.   sort_$initiate      is the procedure of the Sort which must be
                         called first (it "initiates" the Sort).

2.   input_file_proc     is an input_file procedure, as specified in
                         the description of the sort_ subroutine in
                         MPM Subroutines. Instead of calling an
                         input_file procedure, the driver may perform
                         the necessary functions directly.

3.   sort_$commence      is the procedure of the Sort which must be
                         called when the input_file procedure has
                         completed releasing records to the sorting
                         process (it "commences" the merging
                         process). See the entry sort_$commence
                         below.

4.   output_file_proc    is an output_file procedure, as specified in
                         the description of the sort_ subroutine in
                         MPM Subroutines. Instead of calling an
                         output_file procedure, the driver may
                         perform the necessary functions directly.

5.   sort_$terminate     is the procedure of the Sort which must be
                         called when the output_file procedure has
                         completed retrieving records from the Sort
                         (it "terminates" the sorting process). See
                         the entry sort_$terminate below.

The entry points sort_$initiate, sort_$commence, and
sort_$terminate are specifically designed to be used by COBOL
object programs. They support the ANSI COBOL Sort/Merge Module,
Level 2 (the SORT, RELEASE, and RETURN statements).

Normally, when called as a command (sort) or as a subroutine
(sort_), the Sort itself contains the driver to perform the five
calls listed above.



dcl sort_$initiate entry (char(*), ptr, ptr,
                          char(*), float bin(27), fixed bin(35));

call sort_$initiate (temp_dir, keys_ptr, exits_ptr,
                     user_out_sw, file_size, code);

where:

1.   temp_dir       is the pathname of the directory which will
                    contain the Sort's work files. If this argument
                    is "", then work files will be contained in the
                    user's process directory.

                    This argument should be used when the process
                    directory will not be large enough to contain
                    the work files. The get_wdir_ function may be
                    used to obtain the name of the user's current
                    working directory. (Input)

2.   keys_ptr       is a pointer to the keys structure, which
                    describes the key fields to be used for
                    ranking records. This structure is identical
                    to that specified under the heading Keys
                    Structure in the description of the sort_
                    subroutine in MPM Subroutines, Section II.
                    If the user is supplying a compare exit
                    procedure, then keys_ptr must be null and the
                    compare procedure must be specified in the
                    exits structure. (Input)



3.   exits_ptr      is a pointer to the exits structure, which
                    specifies which exit points are to be used
                    and gives the entry point names of the
                    corresponding user supplied exit procedures.
                    This structure is identical to that specified
                    under the heading Exits Structure in the
                    description of the sort_ subroutine in MPM
                    Subroutines, Section II. If no exits are to
                    be used, then exits_ptr must be null. If the
                    compare exit is specified, then keys must not
                    be described. (Input)



4.   user_out_sw    specifies the destination of both the Sort's
                    summary report and diagnostic messages for
                    errors detected in the arguments to
                    sort_$initiate. (Input)

                    This argument may have the following values:

                    ""          = write the summary report and
                                  diagnostic messages via the I/O
                                  switch user_output.

                    "-bf"       = do not write the summary report
                                  and diagnostic messages. If any
                                  errors are diagnosed,
                                  sort_$initiate will return with
                                  the status code bad_arg, but
                                  information about the number and
                                  nature of the errors is not
                                  available.

                    switchname  = write the summary report and
                                  diagnostic messages via the I/O
                                  switch named switchname. This
                                  switch must be attached and open
                                  for stream output.



5.   file_size      is the total amount of data to be sorted, in
                    millions of bytes. If this argument is zero,
                    the default assumption is approximately one
                    million bytes (file_size = 1.0). (Input)

                    The file_size argument is used for
                    optimization of performance; the actual
                    amount of data can be considerably larger
                    without preventing the Sort from completing.
                    The maximum amount of data which can be
                    sorted is (in bytes) approximately 60 million
                    times the square root of file_size.



6.   code           is a standard Multics status code returned by
                    sort_$initiate. Possible values are listed
                    below under the heading Status Codes.
                    (Output)
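
The capacity rule stated for file_size can be checked with a short calculation. The following Python fragment is only an arithmetic illustration of that rule, not part of the Sort; the function name max_sortable_bytes is hypothetical, and the treatment of zero follows the default described above.

```python
import math


def max_sortable_bytes(file_size: float) -> float:
    """Approximate maximum amount of data, in bytes, that can be sorted.

    file_size is given in millions of bytes. A value of zero is treated
    as the default of 1.0, as described for the file_size argument.
    """
    effective = 1.0 if file_size == 0 else file_size
    # "approximately 60 million times the square root of file_size"
    return 60_000_000 * math.sqrt(effective)


for size in (0, 1.0, 4.0, 25.0):
    limit = max_sortable_bytes(size) / 1e6
    print(f"file_size = {size}: about {limit:.0f} million bytes")
```

Because the limit grows only with the square root, quadrupling file_size doubles the approximate maximum (from about 60 million to about 120 million bytes).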



Entry Variables 

Entry variables in the exits structure should be set
(either initialized or assigned) by the procedure which calls the
sort_$initiate entry point.



In order that the Sort can be terminated properly in case of
an abnormal exit, the cleanup procedure of the caller of
sort_$initiate must include a call to the entry point
sort_$terminate.
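
The general driver form given at the beginning of this entry, together with the cleanup requirement above, can be modelled outside Multics. The following Python sketch is illustrative only. ToySort, release, retrieve, and drive are hypothetical names; the in-memory list stands in for the Sort's work files; and the try/finally construct is only an analogue of a PL/I cleanup handler (it also runs on normal exit, where a real driver would call sort_$terminate explicitly).

```python
class ToySort:
    """In-memory stand-in for the Sort (hypothetical, not the Multics code)."""

    def __init__(self):
        self.active = False
        self.records = []

    def initiate(self):
        self.active = True
        self.records = []

    def release(self, record):
        # Called by the input_file procedure for each input record.
        self.records.append(record)

    def commence(self):
        # End of input has been reached: rank the released records.
        self.records.sort()

    def retrieve(self):
        # Called by the output_file procedure; None signals end of data.
        return self.records.pop(0) if self.records else None

    def terminate(self):
        self.active = False
        self.records = []


def drive(sort, input_records):
    """Perform the five driver steps, terminating the Sort even on error."""
    sort.initiate()
    try:
        for record in input_records:          # input_file procedure
            sort.release(record)
        sort.commence()
        output = []                           # output_file procedure
        while (record := sort.retrieve()) is not None:
            output.append(record)
        return output
    finally:
        sort.terminate()                      # runs on normal and abnormal exit


print(drive(ToySort(), ["pear", "apple", "fig"]))  # ['apple', 'fig', 'pear']
```

If the input or output step raises an error, the finally clause still calls terminate, so the model is left ready for another execution, which is the purpose of the cleanup requirement.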



Status Codes

The following status codes may be returned by sort_$initiate
(all codes are in error_table_):

0                   Normal return (no errors).

bad_arg             One or more arguments specified to
                    sort_$initiate, including the keys and exits
                    structures, was invalid or inconsistent. The
                    Sort will have previously written diagnostic
                    messages as directed by the user_out_sw
                    argument. The sorting process itself has not
                    been started.

fatal_error         The Sort has encountered a fatal error. The
                    Sort will have previously generated a
                    specific error message and signalled the
                    sub_error_ condition via the sub_err_
                    subroutine.

out_of_sequence     The call to sort_$initiate is not in the
                    sequence required by the Sort; e.g.,
                    sort_$initiate has been called after
                    initiation of the Sort but before normal
                    termination of that invocation via a call to
                    sort_$terminate.
Entry: sort_$commence

The sort_$commence entry point must be called after the
driver of the Sort has completed its input_file procedure. See
the entry point sort_$initiate above. The call to sort_$commence
informs the Sort that end of input has been reached. Upon return
from sort_$commence, the driver can begin its output_file
procedure.

dcl sort_$commence entry (fixed bin(35));

call sort_$commence (code);

where code is a standard Multics status code returned by
sort_$commence. Possible values are listed below under the
heading Status Codes. (Output)

Status Codes

The following status codes may be returned by sort_$commence
(all codes are in error_table_):

0                   Normal return (no errors).

fatal_error         The Sort has encountered a fatal error during
                    the sorting process. The Sort will have
                    previously generated a specific error message
                    and signalled the sub_error_ condition via
                    the sub_err_ subroutine.

out_of_sequence     The call to sort_$commence is not in the
                    sequence required by the Sort; e.g.,
                    sort_$commence has been called before
                    sort_$initiate.
Entry: sort_$terminate

The sort_$terminate entry point must be called after the
driver of the Sort has completed its output_file procedure. See
the entry point sort_$initiate above. The call to
sort_$terminate informs the Sort that the current execution of
the Sort is complete. Upon return from sort_$terminate, the
caller can initiate another execution of the Sort.



dcl sort_$terminate entry (fixed bin(35));

call sort_$terminate (code);

where code is a standard Multics system status code returned by
sort_$terminate. Possible values are listed below under the
heading Status Codes. (Output)



Status Codes

The following status codes may be returned by
sort_$terminate (all codes are in error_table_):

0                   Normal return (no errors).

out_of_sequence     The call to sort_$terminate is not in the
                    sequence required by the Sort; e.g.,
                    sort_$terminate has been called before
                    sort_$initiate.
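
The out_of_sequence cases listed for the three entry points can be summarized as a small state model. The Python sketch below is an interpretation for illustration, not the Sort's actual logic. The class SortSequence and its state names are hypothetical, and two behaviors are assumptions not stated in this description: a second call to commence is rejected, and terminate is accepted at any time after initiate (which a cleanup procedure would need).

```python
from enum import Enum

OK = 0                                # normal return
OUT_OF_SEQUENCE = "out_of_sequence"   # stands in for the error_table_ code


class State(Enum):
    IDLE = "idle"        # no execution of the Sort in progress
    INPUT = "input"      # initiated; the driver is releasing records
    OUTPUT = "output"    # commenced; the driver is retrieving records


class SortSequence:
    """Tracks only the order of calls, not the sorting itself."""

    def __init__(self):
        self.state = State.IDLE

    def initiate(self):
        # Rejected after initiation but before termination.
        if self.state is not State.IDLE:
            return OUT_OF_SEQUENCE
        self.state = State.INPUT
        return OK

    def commence(self):
        # Rejected before initiate (and, by assumption, if repeated).
        if self.state is not State.INPUT:
            return OUT_OF_SEQUENCE
        self.state = State.OUTPUT
        return OK

    def terminate(self):
        # Rejected before initiate; otherwise ends the current execution.
        if self.state is State.IDLE:
            return OUT_OF_SEQUENCE
        self.state = State.IDLE
        return OK


seq = SortSequence()
print(seq.commence())   # out_of_sequence: called before initiate
print(seq.initiate())   # 0
print(seq.initiate())   # out_of_sequence: already initiated
print(seq.commence())   # 0
print(seq.terminate())  # 0
```

After a successful terminate the model returns to its idle state, so another execution of the Sort can be initiated.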